Abstract

An unusual and surprising expansion of the form

$$p_n = \rho^{-n-1}\left(6n + \frac{18}{5} + \frac{336}{3125}\,n^{-5} + \frac{1008}{3125}\,n^{-6} + \text{smaller order terms}\right),$$

as $n \to \infty$, is derived for the probability $p_n$ that two randomly chosen binary search trees
are identical (in shape and in labels of all corresponding nodes). A quantity arising in
the analysis of phylogenetic trees is also proved to have a similar asymptotic expansion.
Our method of proof is new in the literature of discrete probability and analysis of algorithms,
and based on the psi-series expansions for nonlinear differential equations. Such
an approach is very general and applicable to many other problems involving nonlinear
differential equations; many examples are discussed and several attractive phenomena are
discovered.
1 Introduction

The motivating problem. This paper was originally motivated by the following problem.
Find the asymptotics of the sequence $p_n$ defined recursively by

$$p_n = n^{-2}\sum_{0\le j<n} p_j\,p_{n-1-j} \qquad (n \ge 1), \tag{1}$$

with the initial condition $p_0 = 1$. The sequence $p_n$ is nothing but the probability that two
randomly chosen binary search trees (BSTs) of size $n$ are identical (having exactly the same
shape and with the same labels for corresponding nodes), and was first studied by Martinez in
[26] as an auxiliary function for understanding the typical performance of the equality test of
two random BSTs; see below for more background details. A minor variation of this sequence
was encountered in the analysis of maximum agreement subtrees in [7] under the Yule-Harding
model.
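As a quick illustration of the recurrence (1), the following minimal sketch computes the first few values of $p_n$ exactly. The function name `equality_probabilities` is ours and is not part of the original text.

```python
from fractions import Fraction

def equality_probabilities(n_max):
    """Return [p_0, ..., p_{n_max}] from recurrence (1), as exact fractions."""
    p = [Fraction(1)]  # initial condition p_0 = 1
    for n in range(1, n_max + 1):
        # p_n = n^{-2} * sum_{0 <= j < n} p_j p_{n-1-j}
        conv = sum(p[j] * p[n - 1 - j] for j in range(n))
        p.append(conv / (n * n))
    return p

p = equality_probabilities(8)
for n, value in enumerate(p):
    print(n, value, float(value))
```

The values decay geometrically; the ratio $p_{n-1}/p_n$ already hints at a growth rate close to 3.14.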

While shape parameters defined on a single random tree have been extensively studied in
the literature for many varieties of trees, properties of statistics defined on a pair or on several
random trees have received comparatively less attention, partly because of the intrinsic complexity of
the underlying analytic problems. Yet many practical situations (such as tanglegrams) naturally
lead to such a study, a typical example being the so-called "hereditary properties" or "recurrent
properties", which in turn cover the equality, root occurrence, simplification rules, reduction
rules and "clashes" as special cases; see [26, 31, 14] for more details.

Recently, there has been more study of statistics defined on two random combinatorial
objects; see [6] and the references therein.

Random BSTs. For completeness, we first describe BSTs. Given a sequence of distinct
numbers $(x_1, \dots, x_n)$, we can construct the corresponding BST as follows. If $n = 0$, then the
tree is empty. If $n \ge 1$, then we place $x_1$ at the root; the remaining numbers are compared
one after another with $x_1$, and are directed to the left subtree of the root if they are smaller, to
the right subtree if larger. Numbers directed to each subtree are constructed recursively by the
same procedure according to their original order; see Figure 1 for a plot.




Figure 1: Left: the BST constructed from the sequence {6, 2, 4, 8, 7, 1, 5, 3, 10, 9}. Right: the
root assumes the value $j + 1$ with equal probability $1/n$ for $j = 0, \dots, n - 1$.

By random BSTs, we assume that all $n!$ permutations of $n$ distinct elements are equally
likely, and construct the BST from a random permutation. Then we see that the root assumes
the value $j$ with probability $1/n$ for $j = 1, \dots, n$, which is also the probability that the left
subtree of the root has size $j - 1$.

Definition: [Equality of two ordered, labeled trees]. Two ordered, labeled trees of the same
size (total number of nodes) are said to be equal or identical if either both trees are empty or
they have a common root label with all corresponding ordered subtrees equal.

The definition extends to the equality of $d$ trees with $d > 2$.
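The construction and the definition of equality can be checked by brute force for small $n$. The sketch below is only an illustration: it builds BSTs as nested tuples (a representation we chose for convenience), enumerates all permutations, and computes the probability that two independent random BSTs coincide.

```python
from fractions import Fraction
from itertools import permutations

def insert(tree, key):
    """Insert key into a BST stored as nested tuples (key, left, right)."""
    if tree is None:
        return (key, None, None)
    root, left, right = tree
    if key < root:
        return (root, insert(left, key), right)
    return (root, left, insert(right, key))

def bst_from_sequence(seq):
    """Build the BST by inserting the numbers in their given order."""
    tree = None
    for key in seq:
        tree = insert(tree, key)
    return tree

def equality_probability(n):
    """Probability that two independent random BSTs of size n are identical."""
    counts = {}
    for perm in permutations(range(1, n + 1)):
        tree = bst_from_sequence(perm)
        counts[tree] = counts.get(tree, 0) + 1
    total = sum(counts.values())
    # Two independent draws coincide with probability sum_T P(T)^2.
    return sum(Fraction(c, total) ** 2 for c in counts.values())

print([equality_probability(n) for n in range(6)])
```

For small sizes the output agrees with the values produced by the recurrence (1), for instance 2/9 for $n = 3$.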

Now we take two random BSTs independently, and our $p_n$ gives the probability that the
two trees are identical. Equivalently, we take two random permutations of $n$ elements; then $p_n$
denotes the probability that the BSTs constructed from these two permutations are equal. (A
simple example: (2, 1, 3) and (2, 3, 1) lead to the same BST, whose root 2 has left child 1 and
right child 3.)




A simple upper bound. The simple-looking recurrence (1) can be quickly estimated by the
following inductive argument. If we assume the form $p_n \le c(n+1)g^{-n-1}$ for $n \ge 0$, then we
see by induction that

$$p_n \le \frac{c^2}{n^2}\,g^{-n-1}\sum_{0\le j<n}(j+1)(n-j) = \frac{c^2(n+1)(n+2)}{6n}\,g^{-n-1}.$$

In order that the rightmost term is less than $c(n+1)g^{-n-1}$, we can take a positive integer $n_0$,
let $c := 6n_0/(n_0+2)$, and then choose $g$ as

$$g := \min_{0\le j<n_0}\left(\frac{6n_0(j+1)}{p_j(n_0+2)}\right)^{1/(j+1)}.$$

Then we obtain

$$p_n \le \frac{6n_0}{n_0+2}\,(n+1)\,g^{-n-1}, \tag{2}$$

for all $n \ge 0$. This gives successively improving bounds for $g$ for increasing values of $n_0$; see
Table 1 where we take only the first four digits after the decimal point without rounding. In
particular, taking $n_0 = 6$ leads to the bound $p_n \le \frac{9}{2}(n+1)3^{-n-1}$. The simple bound (2) obtained
by induction and numerical evidence suggest the possibility that $p_n \sim 6n\rho^{-n-1}$ for some value
of $\rho \approx 3.14$ (see Figure 2). How to prove this? And is $\rho = \pi$?

Table 1: Numerical values of $g$.
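The constants $c$ and $g$ of the inductive bound are easy to compute. The following sketch (function names are ours) evaluates them for a given $n_0$, using floating-point roots for $g$; it is meant only to illustrate how the bound tightens as $n_0$ grows.

```python
from fractions import Fraction

def equality_probabilities(n_max):
    """Exact values p_0, ..., p_{n_max} of recurrence (1)."""
    p = [Fraction(1)]
    for n in range(1, n_max + 1):
        p.append(sum(p[j] * p[n - 1 - j] for j in range(n)) / (n * n))
    return p

def bound_constants(n0):
    """Return (c, g) such that p_n <= c (n+1) g^(-n-1) for all n >= 0."""
    p = equality_probabilities(n0 - 1)
    c = Fraction(6 * n0, n0 + 2)
    # g is the minimum over 0 <= j < n0 of (c (j+1) / p_j)^(1/(j+1)).
    g = min(float(c * (j + 1) / p[j]) ** (1.0 / (j + 1)) for j in range(n0))
    return c, g

for n0 in (6, 10, 20, 40):
    c, g = bound_constants(n0)
    print(n0, c, g)
```

For $n_0 = 6$ this gives $c = 9/2$ and $g = 3$ (up to rounding), and larger $n_0$ pushes $g$ towards, but never beyond, the true growth rate.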

The nonlinear differential equation. As the elementary argument we used above is not
strong enough to derive more precise asymptotic approximations to $p_n$, we consider instead
the generating function $P(z) := \sum_{n\ge0} p_n z^n$, which satisfies the nonlinear differential equation
(abbreviated throughout as DE)

$$zP''(z) + P'(z) = P^2(z), \tag{3}$$

with the initial conditions $P(0) = P'(0) = 1$. This nonlinear DE is of Emden-Fowler type
for which there is no explicit closed form solution; see [29]. In addition to the apparent singularity
determined by the equation, the DE (3) also has singularities determined by the initial
conditions, which are often referred to as the movable singularities.
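The passage from the recurrence to the DE is routine; for convenience, here is a short sketch of it. Multiplying the recurrence $n^2 p_n = \sum_{0\le j<n} p_j p_{n-1-j}$ by $z^n$ and summing over $n \ge 1$ gives

```latex
\sum_{n\ge 1} n^2 p_n z^n
  = \Bigl(z\frac{\mathrm{d}}{\mathrm{d}z}\Bigr)^{2} P(z)
  = zP'(z) + z^2 P''(z),
\qquad
\sum_{n\ge 1} z^n \sum_{0\le j<n} p_j\,p_{n-1-j} = z\,P(z)^2 .
```

Dividing both sides by $z$ yields $zP''(z) + P'(z) = P(z)^2$, and the initial conditions are $P(0) = p_0 = 1$ and $P'(0) = p_1 = 1$.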
Figure 2: The figures of $-(\log p_n)/n$ (left) and $-\log\bigl(p_n/(6n + 18/5)\bigr)/(n + 1)$ (right).

Frobenius method. Starting from the DE (3), the next step is often to apply the Frobenius
method (see [23]); namely, we assume the solution $P(z)$ to be of the form

$$P(z) = \sum_{j\ge0} c_j(1 - z/\rho)^{j-\alpha}, \tag{4}$$

for some $\alpha$ and $\rho > 0$, substitute this form into (3), and then determine $\alpha$ and the coefficients
$c_j$ inductively one after another. This classical procedure yields $\alpha = 2$, $c_0 = 6/\rho$,

$$c_1 = -\frac{12}{5\rho},\quad c_2 = -\frac{7}{25\rho},\quad c_3 = -\frac{14}{125\rho},\quad c_4 = -\frac{63}{1250\rho},\quad c_5 = -\frac{161}{9375\rho}. \tag{5}$$

But then inconsistency arises since the coefficient of $(1 - z/\rho)^{2}$ on

$$\text{LHS of (3)} = \rho^{-2}\left(12c_6\rho + \frac{483}{3125}\right) \ne \text{RHS of (3)} = \rho^{-2}\left(12c_6\rho + \frac{385}{3125}\right), \tag{6}$$

and $c_6$ cannot be determined by simply matching the coefficients of both sides. This trial
suggests that the local expansion of $P$ near the singularity $\rho$ will not be of the form (4) and
means that the classical Frobenius method fails for the nonlinear DE (3).
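The computation behind this failure can be replayed exactly. Writing $u_j := \rho c_j$ and $Z := 1 - z/\rho$, matching the coefficient of $Z^{j-4}$ in the DE gives $(j+1)(j-6)u_j = (j-3)^2u_{j-1} + \sum_{1\le i<j}u_iu_{j-i}$ with $u_0 = 6$. The sketch below (names are ours) solves this for $j \le 5$ and evaluates the right-hand side at $j = 6$, where the left-hand side vanishes identically.

```python
from fractions import Fraction

def frobenius_coefficients():
    """Return the scaled coefficients u_0..u_5 and the obstruction at j = 6."""
    u = [Fraction(6)]  # u_0 = rho * c_0 = 6
    for j in range(1, 6):
        phi = (j + 1) * (j - 6)  # factor multiplying u_j
        rhs = (j - 3) ** 2 * u[j - 1] + sum(u[i] * u[j - i] for i in range(1, j))
        u.append(rhs / phi)
    # For j = 6 the factor (j+1)(j-6) is zero, so a pure power series
    # requires the right-hand side below to vanish as well.
    obstruction = 9 * u[5] + sum(u[i] * u[6 - i] for i in range(1, 6))
    return u, obstruction

u, obstruction = frobenius_coefficients()
print(u)
print(obstruction)
```

The obstruction equals $-98/3125 \ne 0$, which is exactly the mismatch between the two sides of the DE at that order.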

Psi-series method. We will introduce a different type of expansion called psi-series expansion
(or Painlevé expansion; see [22]) and it will turn out that $P(z)$ admits an asymptotic
expansion of the form

$$U(Z) := \sum_{j\ge0} Z^{j-2}\sum_{0\le\ell\le\lfloor j/6\rfloor} c_{j,\ell}(\log Z)^{\ell}, \qquad Z := 1 - z/\rho, \tag{7}$$

when $z$ lies near the singularity $\rho$. This form, first conjectured by Martinez in [27, Ch. 9], also
explains why the expansion (4) leads to inconsistency. Thus $z = \rho$ is not a pole but instead a
pseudo-pole; see [22]. The first few terms of $U(Z)$ are given as follows.

$$\rho\,U(Z) = \frac{6}{Z^2} - \frac{12}{5Z} - \frac{7}{25} - \frac{14}{125}Z - \frac{63}{1250}Z^2 - \frac{161}{9375}Z^3 - \frac{14}{3125}Z^4\log Z + \rho c_6Z^4 + \rho\sum_{j\ge7}Z^{j-2}\sum_{0\le\ell\le\lfloor j/6\rfloor}c_{j,\ell}(\log Z)^{\ell}, \tag{8}$$

for $Z$ small, where $c_6 := c_{6,0}$ and the $c_{j,\ell}$'s are polynomials of the parameter $c_6\rho$ with degree
$\lfloor (j-6\ell)/6\rfloor$ for $j \ge 7$.
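The logarithmic term first enters at order $Z^4$, and its coefficient can be checked directly. As a sketch (the notation $u_k$, $a$, $b$ is ours at this point of the text), write $\rho U(Z) = \sum_{k\ge0}u_k(\log Z)Z^{k-2}$ with $u_6(\tau) = a + b\tau$, and compare the coefficients of $Z^2$ on both sides of $\bigl((1-Z)(\rho U)'\bigr)' = (\rho U)^2$:

```latex
\underbrace{7b + 12\,(a + b\tau)}_{\text{from } u_6 Z^{4}} \;-\; 9u_5
  \;=\; 12\,(a + b\tau) + \sum_{1\le j\le 5} u_j u_{6-j}
\quad\Longrightarrow\quad
7b = 9u_5 + \sum_{1\le j\le 5} u_j u_{6-j}
   = -\frac{483}{3125} + \frac{385}{3125} = -\frac{98}{3125},
\qquad b = -\frac{14}{3125}.
```

The constant $a = \rho c_6$ cancels and therefore stays free. Since $[w^n](1-w)^4\log(1-w) = -24/\bigl(n(n-1)(n-2)(n-3)(n-4)\bigr)$ for $n \ge 5$, this term contributes $\frac{24\cdot14}{3125}n^{-5} = \frac{336}{3125}n^{-5}$ to $\rho^{n+1}p_n$ at leading order.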

The approach we use in this paper is roughly as follows. After checking the failure of
the Frobenius method, we construct a suitable psi-series $U(Z)$ (by matching coefficients) so that
$U$ satisfies formally the DE (3). The series in (7) is a priori an asymptotic expansion, but we
will show that it is indeed absolutely convergent in the cut-disk $|Z| < 1 - \varepsilon$, $Z \notin [-1+\varepsilon, 0]$.
Thus the function $U$ is well defined there, satisfies the DE (3), and differs from $P$ only by
the initial conditions. Such a procedure still leaves undetermined two important parameters
(similar to the initial conditions of the DE (3)): one is obviously $\rho$ and the other, implicit one
is $c_6 := c_{6,0}$, which remains free for the same reason that makes the Frobenius method fail. This
means that $U$ is not only a function of $Z$, but also a function of $\rho$ and $c_6$.

Now to fix $U$ in a unique way, we connect $P(z)$ and $U(Z)$ by first choosing a number
$z_0 \in [\varepsilon\rho, \rho - \varepsilon]$, and by considering the solution $(\rho, c_6)$ of the two equations

$$\begin{cases} U(Z_0) = P(z_0), \\ U'(Z_0) = -\rho P'(z_0), \end{cases} \tag{9}$$

where $Z_0 := 1 - z_0/\rho$. We will show below (Proposition 1) that, as a function of $Z$ (or $\rho$)
and $c_6$, the series $U$ has a nonzero radius of convergence for each finite $c_6$. Also we can easily
derive simple upper and lower bounds for $\rho$ as above. Thus, as a standard initial-value problem,
the system of equations (9) has a unique solution pair $(\rho, c_6)$.
Furthermore, $P$ and $U$ have a common region of analyticity, and we see by analytic
continuation that $U$ is the exact and asymptotic solution we have been looking for.

Although no analytic forms for $\rho$ and $c_6$ are available, we can compute the numerical values
of $\rho$ and $c_6$ as follows. First, the values of $U(Z_0)$ and $U'(Z_0)$ can be well approximated by their
partial sums since the terms of the series converge at an exponential rate; see (17). Similarly, the
values of $P(z_0)$ and $P'(z_0)$ can be computed by first computing $p_n$ by its defining recurrence
and then summing a sufficiently large number of initial terms, the convergence rate being
also exponential. Then we solve successively the corresponding system of equations by using
an increasing number of terms in the partial sums; see next section for details.



Asymptotics of $p_n$. From the expansion (7) and suitable analytic continuation to be clarified
below, we deduce our main result for $p_n$.

Theorem 1 The probability $p_n$ that two randomly chosen binary search trees of $n$ nodes are
equal satisfies the asymptotic expansion

$$p_n \sim \rho^{-n-1}\Biggl(6n + \frac{18}{5} + \sum_{j\ge6}n^{-j+1}\sum_{0\le\ell<\lfloor j/6\rfloor}C_{j,\ell}(\log n)^{\ell}\Biggr), \tag{10}$$

for explicitly computable constants $C_{j,\ell}$, where $\rho = 3.14085\,75672\,02936\,95160\ldots$.

Thus $\rho \ne \pi$. In particular, the first few terms read

$$p_n = \rho^{-n-1}\Biggl(6n + \frac{18}{5} + \frac{336}{3125\,n^{5}} + \frac{1008}{3125\,n^{6}} + \frac{10416}{15625\,n^{7}} + \frac{91728}{78125\,n^{8}} + \frac{8234352}{4296875\,n^{9}} + \frac{12228048}{4296875\,n^{10}} + \frac{1}{n^{11}}\Bigl(\frac{9483264}{5078125}\,H_n + \cdots\Bigr) + O\Bigl(\frac{\log n}{n^{12}}\Bigr)\Biggr),$$

where $H_n := \sum_{1\le j\le n}j^{-1}$. No terms of the form $cn^{-j}$ with $j = 1, \dots, 4$
appear in the expansion. Numerically, the parameter $c_6$ can be determined approximately as
$c_6 = -0.00150\,84982\,09405\,93425\ldots$; see the numerical discussion below for details.

As far as we are aware, an asymptotic expansion such as (10) with missing terms is rare in the
analysis of algorithms and applied probability literature. The expansion also indicates that the
approximation of $p_n\rho^{n+1}$ by the first two terms $6n + 18/5$ is numerically very precise, as can
be seen in Figure 2.
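The precision of the two-term approximation is easy to observe numerically. The following sketch (names are ours) computes $p_n$ in floating point from the recurrence and compares $p_n\rho^{n+1}$ with $6n + 18/5$, taking the value of $\rho$ quoted in Theorem 1 as given.

```python
RHO = 3.1408575672029369516  # value of rho quoted in Theorem 1

def equality_probabilities_float(n_max):
    """Floating-point values p_0, ..., p_{n_max} of recurrence (1)."""
    p = [1.0]
    for n in range(1, n_max + 1):
        p.append(sum(p[j] * p[n - 1 - j] for j in range(n)) / (n * n))
    return p

P_VALUES = equality_probabilities_float(40)

def scaled_value(n):
    """Return p_n * rho^(n+1), to be compared with 6n + 18/5."""
    return P_VALUES[n] * RHO ** (n + 1)

for n in (5, 10, 20, 40):
    two_terms = 6 * n + 18 / 5
    print(n, scaled_value(n), two_terms, scaled_value(n) - two_terms)
```

The printed differences shrink rapidly with $n$, in line with the absence of terms $cn^{-j}$ for $j = 1, \dots, 4$.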

Features. In addition to the unusual form of (10) and its theoretical value per se, the interest
of such a psi-series expansion is multifold. First, since no analytic form for the movable
singularity $\rho$ is available, the psi-series expansion provides an effective means for obtaining
an approximate value to $\rho$ by the argument we mentioned above; see (22) below for more
numerical details. Second, from a methodological point of view, the method of proof we use to
prove (10) is of some generality. Note that the first two terms on the right-hand side of (10)
can be easily obtained by the method of matched coefficients once we assume that $p_n$ has the
form (10). Third, the precise approximation we derive has direct consequences in the original
motivating problem, as well as several others in the examples we discuss below. Fourth, such
a consideration leads to several interesting and unexpected phenomena as we will see in the
following sections.



Outline of this paper. We describe the psi-series method and give the proof of the asymptotic
expansion (10) in the next section. Then we extend in Section 3 the consideration of the
probability of equality to either more than two random BSTs or to other variants of BSTs. It
turns out that the forms of the asymptotic expansion for the probability of equality of $d$ random
BSTs differ drastically according to the parity of $d$, a result not intuitively obvious. Section 3.2
considers the case of two random $m$-ary search trees and we will see that the number of missing
terms in the asymptotic expansion increases as $m$ grows. Equality of two random fringe-balanced
BSTs is considered in Section 3.3 and there, unlike $m$-ary search trees, the error term
beyond the constant term in the asymptotic expansion does not change with the structural
parameter once it exceeds one, another unexpected result. Asymptotics of higher-order moments
will then be considered in Section 4 with a few representative examples taken from the cost of
partial-match queries in random trees, random partition structures and solutions of Boltzmann
equations (from statistical physics). We group the details of some proofs in the Appendix.



Notations. For each problem studied, $\rho$ always denotes the dominant singularity of the associated
nonlinear DE and $Z := 1 - z/\rho$. The symbols $c$, $c'$, $c_j$, $c'_j$, $c_{i,j}$
denote suitably chosen constants, not necessarily the same at each occurrence.
2 Psi-series method

We discuss in detail the psi-series solution to our nonlinear DE (3) and the tools needed to
justify it, then we prove (10).

Analytic properties of $P(z)$. First, the solution $P(z)$ to the DE (3) has positive radius of
convergence and is analytic at the apparent fixed singularity $z = 0$ by definition. By simple
induction as we discussed in the introduction (Section 1) and Pringsheim's theorem (since
all coefficients $p_n$ are positive; see [19, p. 240]), we expect that $P(z)$ has a finite movable
singularity at, say $z = \rho$, and the asymptotics of $p_n$ will be dictated by the local asymptotic
expansion of $P(z)$ as $z \sim \rho$.

Martinez [27, p. 117] proved that the function $P(z)$, originally defined only inside the disk
$|z| < \rho$, can be analytically continued to the cut-disk $\{|z| < \rho + \varepsilon\} \setminus [\rho, \rho + \varepsilon]$ with $\rho$ being the
sole singularity there.

From a theoretical point of view, the movable singularity $\rho$ for the DE (3) can be of any of the
following types:

• poles,

• branch points (algebraic or logarithmic),

• essential singularity.

Simple poles and algebraic branch points are first excluded because of the above trial via the
Frobenius method. We then show that $P$ can be analytically continued into a function defined by a series
expansion of the form (7) that converges absolutely in the cut-region

$$\mathscr{D}_R := \{z : 0 < |z - \rho| < R,\ z \notin [\rho, \rho + R]\}, \tag{11}$$

for some $R > 0$. Thus the possibility that $\rho$ is an essential singularity is further excluded, and
$\rho$ is a logarithmic branch point (also called a pseudo-pole).

Our first focus in this paper is on the determination of the right form of the solution to
(3). More detailed and complete introductions and discussions on the theory related to Painlevé
analysis can be found in [9, 11] and the references therein.

The ARS method (Type checking). A widely used procedure to check the singularity type
(and the local expansion) of nonlinear differential equations is the following procedure, often
called the ARS algorithm due to Ablowitz, Ramani and Segur [1], which bears some resemblance
to the Frobenius method.

In this method, we start by assuming that the solution to the DE (3) admits the formal Laurent
expansion (4) in the cut-disk $\mathscr{D}_R$ for some positive number $R$.

① Leading order analysis: Assume $P(z) \sim c_0(1 - z/\rho)^{-\alpha}$. By balancing the dominant
terms $\rho P''(z)$ and $P(z)^2$ in (3), we see, as in the Frobenius method, that $\alpha = 2$ and the
companion constant $c_0 = 6/\rho$. Thus we can exclude the possibility of an algebraic
singularity.

② Resonance analysis: Starting from this pair $(\alpha, c_0) = (2, 6/\rho)$, if the solution admits only
poles, then by substituting (4) into (3) and by equating coefficients, the coefficients $c_j$'s
are characterized by the recurrence relation of the form

$$\Phi(j)\,c_j = (j - 3)^2c_{j-1} + \rho\sum_{1\le i<j}c_ic_{j-i} =: G_j(\rho, c_0, c_1, \dots, c_{j-1}), \qquad j \ge 1, \tag{12}$$

where $\Phi(j) = (j + 1)(j - 6)$ and $c_j = 0$ for all $j < 0$. The roots of $\Phi$ are called
resonances and $-1$ is always a root of $\Phi$, reflecting the arbitrariness of the movable
singularity $\rho$. For most of our purposes, a less involved and very commonly used technique
is to substitute the test function

$$c_0(1 - z/\rho)^{-\alpha} + c_r(1 - z/\rho)^{r-\alpha}$$

into the DE (3) instead. By collecting the coefficients corresponding to the term
$c_r(1 - z/\rho)^{r-4}$, we still get the same $\alpha$, $c_0$ and $\Phi(r)$. In this case, we see that $\Phi$ has only one
positive resonance 6 that needs to be further examined.

③ Compatibility: Once we have the system (12) and identify the resonance, the next step
is to consider its solvability. Obviously, (4) is the solution to (3) if and only if all the
coefficients $c_j$'s can be computed recursively by (12). This fact defines the compatibility
of the resonance: for any resonance $r$ of $\Phi$, if $G_r(\rho, c_0, c_1, \dots, c_{r-1}) = 0$ is satisfied,
then the resonance $r$ is said to be compatible; otherwise, $r$ is incompatible.

From (5) and (6) it follows that $r = 6$ is incompatible. The formal series solution by
introducing suitable logarithmic terms starting at the index 6 has to be considered instead
(see (7)). The movable singularity $\rho$ to (3) is proved to be a logarithmic branch point
since we will show that the associated series solution is absolutely convergent in the
region $\mathscr{D}_R$ for some $R > 0$.

In cases when all resonances are compatible, the Laurent expansion
is the solution we need if it has a positive radius of convergence. The above ARS algorithm
is useful in determining if a nonlinear ODE admits the Painlevé property, namely, the DE has
only solutions free from movable branch points. In our case, the DE (3) does not satisfy the
Painlevé property.
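For the DE considered here, the resonance polynomial can be recovered in a few lines. The following is a sketch of the test-function computation, with $Z := 1 - z/\rho$, keeping only the terms that are linear in $c_r$ and of lowest order in $Z$:

```latex
P = c_0 Z^{-2} + c_r Z^{r-2}:
\qquad
zP'' + P' \;\ni\; \frac{(r-2)(r-3)}{\rho}\,c_r Z^{r-4},
\qquad
P^2 \;\ni\; 2c_0c_r Z^{r-4} = \frac{12}{\rho}\,c_r Z^{r-4},
```

```latex
\Phi(r) = (r-2)(r-3) - 12 = r^2 - 5r - 6 = (r+1)(r-6).
```

The root $r = -1$ corresponds to shifting the location of the singularity, and $r = 6$ is the only positive resonance.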

Our approach vs the ARS algorithm. The method of proof we use does not, however, rely
completely on this method for two reasons. First, it requires the a priori information that $\rho$ is
not an essential singularity, a property often hard to prove. Second, even if we can prove that the
singularity is not essential, the incompatibility of a resonance (or several) may in some cases be
very difficult to establish due to the variation of an additional parameter, as in the cases of $d$
random BSTs (Subsection 3.1) and $m$-ary search trees (Subsection 3.2).

On the other hand, the ARS algorithm does provide an effective means of computing the
exact form of the psi-series expansion for all the examples we discuss, notably the characterization
of the resonance. We will thus use the ARS algorithm for two purposes: first, when the
resonance equation has no positive integral resonance or when all resonances are compatible,
then the solution is given by a Laurent expansion; second, when the Laurent expansion fails, we
use the ARS algorithm to guess the possible form of the psi-series expansion we are looking
for, and then the proof will be conducted in the same way as for (3). Of course, there
are also cases for which the ARS algorithm can be easily justified and the singularity is not
essential (say, by the absolute convergence of the psi-series).



Absolute convergence of the psi-series. We now prove that $U(Z)$ converges absolutely in a
cut-disk $\mathscr{D}_R$ for some positive $R > 0$.

Proposition 1 For each fixed $c_6$, the psi-series expansion (7) converges absolutely for $z$ in the
cut-disk $\mathscr{D}_{(1-\varepsilon)\rho}$ (defined in (11)), where $\varepsilon > 0$ is a small number.

The range $|z - \rho| < (1 - \varepsilon)\rho$ is the best that our approach can achieve although it seems to hold
true, by numerical evidence, up to $|z - \rho| < \rho$; in particular, this suggests that the psi-series
expansion is convergent even for $Z = 1$, that is, $z = 0$ for $P(z)$.

From this proposition, we see that the solution $P(z)$ can be analytically continued to at least
the region

$$\bigl(\{z : |z| < \rho + \varepsilon\} \cup \{z : |z - \rho| < (1 - \varepsilon)\rho\}\bigr) \setminus [\rho, (2 - \varepsilon)\rho] \qquad (\varepsilon > 0),$$

from which we deduce (10).

To prove Proposition 1 we adopt an approach due to Hille [22] with some new ingredients;
see also [21]. The resulting proof can then be extended to cover all the types of DEs we discuss
in this paper, whatever their orders.



Proof of the absolute convergence of the psi-series. I. Recurrence of $u_k$. We first rewrite
the DE (3) for $P$ into that for $U$, which becomes

$$\bigl((1 - Z)U'(Z)\bigr)' = \rho\,U(Z)^2.$$

For convenience, let $U_0 = \rho U$. Then

$$\bigl((1 - Z)U_0'(Z)\bigr)' = U_0(Z)^2.$$

As in [21], we then convert this DE into a first-order differential system by introducing an
additional function $V_0 := (1 - Z)U_0'(Z)$ as follows.

$$\begin{cases} U_0'(Z) = \dfrac{V_0(Z)}{1 - Z}, \\[1ex] V_0'(Z) = U_0(Z)^2. \end{cases} \tag{13}$$

Let $\tau = \log Z$, $U_0(Z) = \sum_{k\ge0}u_k(\tau)Z^{k-2}$ and $V_0(Z) = \sum_{k\ge0}v_k(\tau)Z^{k-3}$, where $u_k$ and $v_k$ are
polynomials in $\tau$ of degree at most $\lfloor k/6\rfloor$. Note that $(d\tau)/(dZ) = Z^{-1}$ and $c_0 = 6/\rho$. From
(13), we derive an infinite system of equations in $k$ ($\dot u_k := u_k'(\tau)$)

$$\begin{cases} \dot u_k + (k - 2)u_k = v_k + \displaystyle\sum_{0\le j<k}v_j, \\[1ex] \dot v_k + (k - 3)v_k = 12u_k + \displaystyle\sum_{1\le j<k}u_ju_{k-j}, \end{cases} \qquad (k \ge 7).$$

We can further express the above system in terms of matrices as follows. Let

$$\phi_k := \begin{pmatrix} u_k \\ v_k \end{pmatrix}, \qquad A_k := \begin{pmatrix} k - 2 & -1 \\ -12 & k - 3 \end{pmatrix}, \qquad\text{and}\qquad g_k := \begin{pmatrix} \sum_{0\le j<k}v_j \\ \sum_{1\le j<k}u_ju_{k-j} \end{pmatrix}.$$

Then, for $k \ge 7$,

$$\dot\phi_k + A_k\phi_k = g_k, \tag{14}$$

which can be explicitly solved.

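Lemma 1 For $k \ge 7$, $\phi_k$ admits a unique solution satisfying

$$\lim_{\tau\to-\infty}\bigl\|e^{A_k\tau}\phi_k(\tau)\bigr\| = 0$$

of the form

$$\phi_k(\tau) = \int_0^{\infty}Pe^{-xD}P^{-1}g_k(\tau - x)\,dx, \tag{15}$$

where

$$D := \begin{pmatrix} k + 1 & 0 \\ 0 & k - 6 \end{pmatrix}, \qquad P := \begin{pmatrix} 1 & 1 \\ -3 & 4 \end{pmatrix} \qquad\text{and}\qquad P^{-1} = \frac{1}{7}\begin{pmatrix} 4 & -1 \\ 3 & 1 \end{pmatrix}.$$

Proof. The fundamental matrix solution associated with the homogeneous part of (14) is $e^{-A_k\tau}$,
so we can solve (14) by multiplying it by $e^{A_k\tau}$ and then by using the fact that $u_k(\tau)$ and $v_k(\tau)$
are polynomials in $\tau$, which gives

$$\frac{d}{d\tau}\bigl(e^{A_k\tau}\phi_k(\tau)\bigr) = e^{A_k\tau}g_k(\tau).$$

Integrating both sides from $-\infty$ to $\tau$, we get

$$e^{A_k\tau}\phi_k(\tau) = \int_{-\infty}^{\tau}e^{A_kx}g_k(x)\,dx,$$

or

$$\phi_k(\tau) = \int_{-\infty}^{\tau}e^{(x-\tau)A_k}g_k(x)\,dx.$$

The lemma then follows by a change of variables and the diagonalization $A_k = PDP^{-1}$. ∎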
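The diagonalization used in the lemma can be verified mechanically. The sketch below assumes the matrix $A_k$ with rows $(k-2, -1)$ and $(-12, k-3)$, which is what the first-order system gives, and checks $PDP^{-1} = A_k$ with exact arithmetic; it also evaluates $\|P\|\,\|P^{-1}\|$ in the maximum-row-sum norm. The helper names are ours.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]

def row_sum_norm(a):
    """Maximum absolute row sum of a matrix."""
    return max(sum(abs(entry) for entry in row) for row in a)

P = [[1, 1], [-3, 4]]
P_INV = [[Fraction(4, 7), Fraction(-1, 7)], [Fraction(3, 7), Fraction(1, 7)]]

def check_diagonalisation(k):
    """Check that P diag(k+1, k-6) P^{-1} equals A_k."""
    a_k = [[k - 2, -1], [-12, k - 3]]
    d = [[k + 1, 0], [0, k - 6]]
    return matmul(matmul(P, d), P_INV) == a_k

print(all(check_diagonalisation(k) for k in range(7, 50)))
print(row_sum_norm(P) * row_sum_norm(P_INV))
```

The eigenvalues $k+1$ and $k-6$ are the two factors of the resonance polynomial, and both are positive for $k \ge 7$, which is what makes the integral in (15) converge.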

Proof of the absolute convergence of the psi-series. II. An estimate for $u_k$. To estimate
the growth order of $u_k$ and $v_k$, we now introduce the following norm: for any $x \in \mathbb{C}^n$ and any
matrix $A = (a_{ij})_{n\times n}$,

$$\|x\| = \max_{1\le i\le n}|x_i|, \qquad \|A\| = \max_{1\le i\le n}\sum_{1\le j\le n}|a_{ij}|.$$

With this norm, we then have the inequality

$$\max\{|u_k(\tau)|, |v_k(\tau)|\} \le \|\phi_k\| \le 5\int_0^{\infty}e^{-x(k-6)}\max\Bigl\{\sum_{0\le j<k}|v_j(\tau - x)|,\ \sum_{1\le j<k}|u_j(\tau - x)u_{k-j}(\tau - x)|\Bigr\}\,dx. \tag{16}$$



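Now write $\rho - z = re^{i\theta}$, so that $\tau = \log(r/\rho) + i\theta = \xi + i\theta$, where $\tau \in \mathscr{S}_\varepsilon$ and

$$\mathscr{S}_\varepsilon := \{\xi + i\theta : \xi \in (-\infty, -\varepsilon] \text{ and } |\theta| < \pi\},$$

with $|1 - \tau| \ge 1 + \varepsilon$. We prove by induction that

$$|u_k(\tau)| \le \frac{K|1 - \tau|^{k-6}}{\sqrt{k + 1}}, \qquad |v_k(\tau)| \le \frac{K|1 - \tau|^{k-6}}{\sqrt{k + 1}}, \tag{17}$$

for $k \ge 0$ and $\tau \in \mathscr{S}_\varepsilon$, where the constant $K > 0$ is easily tuned according to the initial
conditions.

Then, by induction hypothesis,

$$\sum_{0\le j<k}|v_j(\tau)| \le K\sum_{0\le j<k}|1 - \tau|^{j-6} \le \frac{K}{|1 - \tau| - 1}\,|1 - \tau|^{k-6} \le \frac{K}{\varepsilon}\,|1 - \tau|^{k-6},$$

and

$$\sum_{1\le j<k}|u_j(\tau)u_{k-j}(\tau)| \le K^2|1 - \tau|^{k-12}\sum_{1\le j<k}\frac{1}{\sqrt{(j + 1)(k - j + 1)}} \le K^2|1 - \tau|^{k-12}\int_0^k\frac{dx}{\sqrt{x(k - x)}} = \pi K^2|1 - \tau|^{k-12}.$$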
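Now

$$\begin{aligned}
\max\{|u_k(\tau)|, |v_k(\tau)|\} &\le \|\phi_k(\tau)\| = \left\|\int_{-\infty}^{\tau}Pe^{(x-\tau)D}P^{-1}g_k(x)\,dx\right\| = \left\|\int_0^{\infty}Pe^{-xD}P^{-1}g_k(\tau - x)\,dx\right\| \\
&\le \|P\|\,\|P^{-1}\|\int_0^{\infty}e^{-x(k-6)}\|g_k(\tau - x)\|\,dx \\
&\le 5\int_0^{\infty}e^{-x(k-6)}\max\Bigl\{\sum_{0\le j<k}|v_j(\tau - x)|,\ \sum_{1\le j<k}|u_j(\tau - x)u_{k-j}(\tau - x)|\Bigr\}\,dx.
\end{aligned}$$

By choosing $\varepsilon < 1/(\pi K)$, so that $K/\varepsilon > \pi K^2$, we have

$$|u_{k+6}(\tau)|, |v_{k+6}(\tau)| \le \frac{5K}{\varepsilon}\int_0^{\infty}e^{-xk}|1 - \tau + x|^k\,dx \le \frac{5K}{k\varepsilon}\,|1 - \tau|^k\int_0^{\infty}e^{-x}\left|1 + \frac{x}{k(1 - \tau)}\right|^k dx.$$

Since $|1 - \tau| \ge 1 + \varepsilon$ for $\tau \in \mathscr{S}_\varepsilon$, we see that

$$\int_0^{\infty}e^{-x}\left|1 + \frac{x}{k(1 - \tau)}\right|^k dx \le \int_0^{\infty}e^{-x}\left(1 + \frac{x}{k|1 - \tau|}\right)^k dx \le \int_0^{\infty}e^{-x(1 - 1/|1-\tau|)}\,dx = \frac{|1 - \tau|}{|1 - \tau| - 1} \le \frac{1 + \varepsilon}{\varepsilon}.$$

It follows that

$$\frac{5K}{k\varepsilon}\,|1 - \tau|^k\int_0^{\infty}e^{-x}\left|1 + \frac{x}{k(1 - \tau)}\right|^k dx \le \frac{5K(1 + \varepsilon)}{k\varepsilon^2}\,|1 - \tau|^k \le \frac{K|1 - \tau|^k}{\sqrt{k + 7}}, \tag{18}$$

for all $k \ge k_0$, where $k_0$ is sufficiently large and depends only on $\varepsilon$. This proves the required estimate.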
Proof of the absolute convergence of the psi-series: an estimate for $U(Z)$. From (17), we
obtain

$$\rho\,|U(Z)| \le \sum_{k\ge0}|u_k(\tau)|\,|Z|^{k-2} = O\Bigl(e^{-2\Re(\tau)}\bigl(1 - |1 - \tau|e^{\Re(\tau)}\bigr)^{-1}\Bigr) = O(1),$$

provided that

$$|1 - \tau|\,e^{\Re(\tau)} < 1.$$

But this implies that ($e^{\Re(\tau)} = r/\rho$)

$$r\,|1 - \tau| < \rho.$$

This proves that the series (7) is absolutely convergent for $z \in \mathscr{D}_{(1-\varepsilon)\rho}$. ∎

Numerical approximations to $\rho$ and $c_6$. As mentioned in the Introduction, $P$ is connected to
$U$ by choosing a point $z_0$ in $[\varepsilon\rho, \rho - \varepsilon]$; then the values of $(\rho, c_6)$ are determined by solving
numerically the two equations $P(z_0) = U(Z_0)$ and $P'(z_0) = -\rho^{-1}U'(Z_0)$, where $Z_0 := 1 - z_0/\rho$.

For numerical purposes, we can compute the approximate values of $P(z_0)$ or $P'(z_0)$ by
their corresponding truncated series expansions using, say the first $N$ terms; for example,
$P(z_0) \approx \sum_{j<N}p_jz_0^j$. The number of terms used depends on the degree of numerical precision
we require, and the remainder $\sum_{j\ge N}p_jz_0^j$ can be well estimated by using the asymptotic
expansion (10). More precisely, for large $N$,

$$\sum_{j\ge N}p_jz_0^j = O\bigl(N(z_0/\rho)^N\bigr). \tag{19}$$

Since $z_0 < \rho$, the right-hand side can be made arbitrarily small by choosing $N$ sufficiently large
so that the error introduced is under control.

Similarly, $U(Z) \approx U_M(Z) := \rho^{-1}\sum_{k<M}u_k(\log Z)Z^{k-2}$ for a sufficiently large $M$ whose
choice can be determined by the desired degree of precision and the upper bound (17):

$$\sum_{k\ge M}u_k(\tau_0)e^{(k-2)\tau_0} = O\bigl(M^{-1/2}|1 - \tau_0|^{M}e^{M\Re(\tau_0)}\bigr), \tag{20}$$

where $\tau_0 = \log(Z_0)$.

Note that if $z_0$ is too close to zero, then the remainder (19) for $P$ decreases much faster than
that (20) for $U$, and if $z_0$ is too close to $\rho$, then the converse is true. So the best choice for $z_0$
will be the one for which both remainders are asymptotically of the same order. For practical use,
since $p_n$ is easier to compute than $u_k$, we take $M = \beta N$ for some $\beta \in (0, 1)$. Then we solve
the equation

$$\left(1 - \log\Bigl(1 - \frac{z_0}{\rho}\Bigr)\right)^{\beta}\left(1 - \frac{z_0}{\rho}\right)^{\beta} = \frac{z_0}{\rho}, \tag{21}$$

(which obviously has a unique real solution for $z_0/\rho \in (1/2, 1)$) to find the best $z_0$.

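On the other hand, to compute $u_k$, we take the first entry of $\phi_k$ in (15) and obtain the
recurrence

$$\begin{aligned}
u_k(\tau) &= \frac{1}{7}\int_0^{\infty}\bigl(3e^{-(k-6)x} + 4e^{-(k+1)x}\bigr)\bigl((k - 3)u_{k-1}(\tau - x) + \dot u_{k-1}(\tau - x)\bigr)\,dx \\
&\quad + \frac{1}{7}\int_0^{\infty}\bigl(e^{-(k-6)x} - e^{-(k+1)x}\bigr)\sum_{1\le j<k}u_j(\tau - x)u_{k-j}(\tau - x)\,dx \\
&= u_{k-1}(\tau) + \frac{1}{7}\int_0^{\infty}\bigl(9e^{-(k-6)x} - 16e^{-(k+1)x}\bigr)u_{k-1}(\tau - x)\,dx \\
&\quad + \frac{1}{7}\int_0^{\infty}\bigl(e^{-(k-6)x} - e^{-(k+1)x}\bigr)\sum_{1\le j<k}u_j(\tau - x)u_{k-j}(\tau - x)\,dx,
\end{aligned}$$

for $k \ge 7$. All these polynomials $u_k$'s are solvable recursively starting from the initial values

$$u_0 = 6,\quad u_1 = -\frac{12}{5},\quad u_2 = -\frac{7}{25},\quad u_3 = -\frac{14}{125},\quad u_4 = -\frac{63}{1250},\quad u_5 = -\frac{161}{9375},\quad u_6 = \rho c_6 - \frac{14}{3125}\tau,$$

with the two free parameters $\rho$ and $c_6$. More explicitly, let $u_k(\tau) := \sum_{0\le s\le\lfloor k/6\rfloor}u_{k,s}\tau^s$. Then

$$\begin{aligned}
u_{k,s} &= u_{k-1,s} + \sum_{s\le\ell\le\lfloor(k-1)/6\rfloor}u_{k-1,\ell}\,\frac{(-1)^{\ell-s}\,\ell!}{s!}\left(\frac{9}{7(k - 6)^{\ell-s+1}} - \frac{16}{7(k + 1)^{\ell-s+1}}\right) \\
&\quad + \sum_{\substack{1\le j<k \\ 0\le\ell_1\le\lfloor j/6\rfloor \\ 0\le\ell_2\le\lfloor(k-j)/6\rfloor}}u_{j,\ell_1}u_{k-j,\ell_2}\,\frac{(-1)^{\ell_1+\ell_2-s}(\ell_1 + \ell_2)!}{s!}\left(\frac{1}{7(k - 6)^{\ell_1+\ell_2-s+1}} - \frac{1}{7(k + 1)^{\ell_1+\ell_2-s+1}}\right),
\end{aligned}$$

for $0 \le s \le \lfloor k/6\rfloor$, where the last sum is restricted to $\ell_1 + \ell_2 \ge s$.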

We finally solve numerically the pair $(\rho, c_6)$ from the two equations with $\rho \in (3, 4)$

$$P_N(z_0) = U_M(Z_0) \qquad\text{and}\qquad P_N'(z_0) = -\rho^{-1}U_M'(Z_0). \tag{22}$$

Numerical evidence suggests that the series definitions of $U(Z)$ and $U'(Z)$ are both convergent
for $Z = 1$, which means that one might even use the two equations

$$U(1) = 1, \qquad U'(1) = -\rho,$$

to solve for the pair $(\rho, c_6)$. But the convergence is much slower than taking $z_0$ according to
(21).
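A much cruder way to approximate $\rho$, which is not the matching procedure described above, is to use the ratio $p_{n-1}/p_n$ together with the two leading terms $6n + 18/5$ of the expansion. The sketch below (names are ours) is offered only as a quick plausibility check of the quoted digits; it gives no information on $c_6$.

```python
from fractions import Fraction

def equality_probabilities(n_max):
    """Exact values p_0, ..., p_{n_max} of recurrence (1)."""
    p = [Fraction(1)]
    for n in range(1, n_max + 1):
        p.append(sum(p[j] * p[n - 1 - j] for j in range(n)) / (n * n))
    return p

def rho_estimate(n):
    """Estimate rho from p_{n-1}/p_n, corrected by the leading terms 6n + 18/5."""
    p = equality_probabilities(n)
    ratio = p[n - 1] / p[n]
    current = 6 * n + Fraction(18, 5)         # leading terms for p_n
    previous = 6 * (n - 1) + Fraction(18, 5)  # leading terms for p_{n-1}
    return float(ratio * current / previous)

for n in (10, 20, 40, 60):
    print(n, rho_estimate(n))
```

Because the neglected terms are of relative order $n^{-6}$, the estimate stabilizes quickly around 3.1408575672.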



A quantity arising in phylogenetic trees. Very similar to the original motivations of studying
$p_n$, the following recurrence

$$q_n = \frac{2}{(n - 1)^2}\sum_{1\le j<n}q_jq_{n-j} \qquad (n \ge 2), \tag{23}$$

with $q_1 = 1$ was introduced in Bryant et al. [7] in the course of analyzing the size of a maximum
agreement subtree in two randomly chosen trees according to the Yule-Harding model. The
quantity serves as an effective bound for the probability that the size of a common maximum
agreement subtree exceeds a certain given value.

Let $p_n := 2q_{n+1}$. Then the recurrence (23) becomes

$$p_n = n^{-2}\sum_{0\le j<n}p_jp_{n-1-j},$$

of exactly the same form as (1) but with $p_0 = 2$. This means that the DE satisfied by the
generating function $P(z) = \sum_n p_nz^n$ remains the same as (3) but the initial condition differs.
The same psi-series method we used above applies and we obtain the asymptotic expansion

$$q_n = \rho^{-n}\left(3n - \frac{6}{5} + \frac{168}{3125}\,n^{-5} + \frac{336}{3125}\,n^{-6} + O(n^{-7})\right),$$

with $\rho = 1.57042\,87836\,01468\,47580\,40837\ldots$.
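The link with the BST sequence can be made explicit: if $p_n$ denotes the solution of (1) with $p_0 = 1$, then the substitution above implies $q_n = 2^{n-1}p_{n-1}$, which also explains why the singularity here is half the value found for BSTs. The sketch below (names are ours, and the recurrence for $q_n$ is taken with the factor $2/(n-1)^2$) checks this identity exactly.

```python
from fractions import Fraction

def equality_probabilities(n_max):
    """Exact values p_0, ..., p_{n_max} of recurrence (1) with p_0 = 1."""
    p = [Fraction(1)]
    for n in range(1, n_max + 1):
        p.append(sum(p[j] * p[n - 1 - j] for j in range(n)) / (n * n))
    return p

def q_sequence(n_max):
    """Exact values q_1, ..., q_{n_max}; index 0 is unused."""
    q = [None, Fraction(1)]  # q_1 = 1
    for n in range(2, n_max + 1):
        conv = sum(q[j] * q[n - j] for j in range(1, n))
        q.append(2 * conv / (n - 1) ** 2)
    return q

q = q_sequence(12)
p = equality_probabilities(11)
print([str(value) for value in q[1:6]])
print(all(q[n] == 2 ** (n - 1) * p[n - 1] for n in range(1, 13)))
```

In particular, the leading behaviour $3n - 6/5$ is what one gets from $6(n-1) + 18/5$ after division by 2.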

3 Probability of equality of random trees

The consideration of the equality of two random BSTs can be easily extended either to more
random BSTs or to other variants of BSTs.

3.1 Equality of d random BSTs

We extend in this subsection the same psi-series analysis to $d$ random BSTs, $d \ge 2$. Surprisingly,
the resulting forms of the asymptotic expansions depend on the parity of $d$.

Recurrence. The random BST model is as introduced above. Let $p_n = p_n(d)$ denote the
probability that $d$ random BSTs, each independent of the others, are identical; more precisely,
the probability that the BSTs constructed from $d$ random permutations are all the same. Then
$p_n$ satisfies the recurrence

$$p_n = n^{-d}\sum_{0\le j<n}p_jp_{n-1-j} \qquad (n \ge 1), \tag{24}$$

with $p_0 = 1$. Let $P(z) := \sum_{n\ge0}p_nz^n$ be the generating function of $p_n$. Then $P(z)$ satisfies the
nonlinear DE of order $d$

$$\left(z\frac{\mathrm{d}}{\mathrm{d}z}\right)^{d}P(z) = zP(z)^2, \tag{25}$$

with $p_0 = 1$ and the first $d - 1$ values $p_n$ for $1 \le n < d$ given by the recurrence (24).

The ARS Algorithm. As in the case of two random BSTs above, we begin with applying the
ARS Algorithm and check first if there are pseudo-poles and incompatibility.

① Leading order analysis: This part is always easy for the problems we study in this paper
and we obtain, by assuming $P(z) \sim c_0(1 - z/\rho)^{-\alpha}$ and by matching coefficients, $\alpha = d$
and $c_0 = \rho^{-1}(2d)!/(2\,d!)$.

② Resonance analysis: On the other hand, by collecting the coefficient for the term
$c_r(1 - z/\rho)^{r-2d}$ in the resulting expansion for (25), we obtain the polynomial characterizing all
possible resonances

$$\Phi_d(r) = \frac{(2d - 1 - r)!}{(d - 1 - r)!} - \frac{(2d)!}{d!} = \begin{cases} (r + 1)\phi_d(r), & \text{if } d \text{ is odd}; \\ (r + 1)(r - 3d)\phi_d(r), & \text{if } d \text{ is even}, \end{cases} \tag{26}$$

where $\phi_d$ is a polynomial of even order and has no real zeroes. We see that if $d$ is odd,
then there is no additional integer-valued resonance except $-1$ for this case. Thus, the
movable singularity $\rho$ is a pole of order $d$. On the other hand, if $d$ is even, then there
exists an additional, unique, positive, integer-valued resonance $3d$ for each $d$.

③ Incompatibility: We need only consider the case when $d$ is even. The incompatibility of
the resonance at $r = 3d$ is easily checked for each specific even $d$, but a proof that
$r = 3d$ leads to incompatibility for all even $d$ is not obvious.
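The parity phenomenon in the resonance analysis can be observed by a direct search. The sketch below reads the factorial quotient in the resonance polynomial as the product $\prod_{0\le i<d}(d - r + i)$, which is an interpretation we adopt so that the polynomial makes sense for every integer $r$, and lists its integer roots in a finite window.

```python
from math import factorial, prod

def resonance_polynomial(d, r):
    """Phi_d(r) = prod_{0 <= i < d} (d - r + i) - (2d)!/d! for integer r."""
    return prod(d - r + i for i in range(d)) - factorial(2 * d) // factorial(d)

def integer_resonances(d, search=200):
    """Integer roots of Phi_d in the window [-search, search]."""
    return [r for r in range(-search, search + 1) if resonance_polynomial(d, r) == 0]

for d in range(2, 9):
    print(d, integer_resonances(d))
```

For $d = 2$ this recovers the resonances $-1$ and $6$ of the binary case; for odd $d$ only $-1$ shows up, and for even $d$ the additional root is $3d$.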

The case when d is odd. From the above quick check by the ARS algorithm, we see that the
solution for the DE (25) admits the Laurent series expansion

$$P(z) = \sum_{0\le j<d}c_j(1 - z/\rho)^{j-d} + \mathscr{B}(z),$$

where $\mathscr{B}(z) = \mathscr{B}_d(z)$ is analytic at $\rho$.

The case when d is even. By the above procedure of the ARS algorithm, we anticipate a
psi-series expansion for $P(z)$ of the form

$$\rho P(z) = \sum_{j\ge0}Z^{j-d}\sum_{0\le\ell\le\lfloor j/(3d)\rfloor}c_{j,\ell}(\log Z)^{\ell}, \tag{27}$$

where the $c_{j,\ell}$'s are chosen so that the psi-series satisfies the DE (25). In particular, the first few
terms read

$$\rho P(z) = \frac{(2d)!}{2\,d!}\,Z^{-d} - \frac{(3d - 2)(d - 1)(2d)!}{4(3d - 1)\,d!}\,Z^{-d+1} + \sum_{2\le j<3d}c_jZ^{j-d} + c_{3d}Z^{2d} + c_{3d,1}Z^{2d}\log Z + \cdots.$$

The justification of the psi-series on the right-hand side of (27) follows the same pattern as that
for two random BSTs; see Appendix A for details.

In summary, we conclude the following asymptotic estimates, the drastic change of the 
error term according to the parity of d unveiling an additional surprise. 

Theorem 2 The probability that $d \ge 2$ randomly chosen BSTs are all equal satisfies

$$p_n = \frac{(2d)!}{2\,d!\,(d-1)!}\,\rho^{-n-1} n^{d-1}\Bigl(1 + \sum_{1\le j<d} c_j n^{-j}\Bigr) + \begin{cases} O\bigl(\rho^{-n}(1-\varepsilon)^n\bigr), & \text{if $d$ is odd;}\\ K n^{-2d-1}\rho^{-n-1} + O\bigl(\rho^{-n} n^{-2d-2}\bigr), & \text{if $d$ is even,} \end{cases}$$

where $\varepsilon > 0$, the $c_j$'s are constants, $\rho = \rho_d$ depends on $d$ and $K$ is a constant depending only on $d$.

More precise asymptotic expansions can be derived, but we content ourselves with the current form for simplicity of presentation. Is there any intuitive reason why the asymptotic expansion of $p_n = p_n(d)$ differs according to the parity of $d$?
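As a numerical companion to Theorem 2, the following sketch iterates the recurrence for $d$ random BSTs, assumed here to be $p_n = n^{-d}\sum_{0\le j<n} p_j p_{n-1-j}$ with $p_0 = 1$ (the natural extension of (1); this form and the function names are assumptions of the sketch). It returns exact rational values and a crude ratio estimate of $\rho$.

```python
from fractions import Fraction

def bst_equality_probs(d, n_max):
    """Exact p_0..p_{n_max} for the assumed recurrence
    p_n = n^(-d) * sum_{0 <= j < n} p_j * p_{n-1-j}, with p_0 = 1."""
    p = [Fraction(1)]
    for n in range(1, n_max + 1):
        conv = sum(p[j] * p[n - 1 - j] for j in range(n))
        p.append(conv / Fraction(n) ** d)
    return p

def rho_estimate(d, n):
    """Crude estimate of the radius of convergence from p_n / p_{n+1}."""
    p = bst_equality_probs(d, n + 1)
    return float(p[n] / p[n + 1])
```

With `d = 1` every term equals 1 and with `d = 0` the Catalan numbers appear, which gives a quick sanity check of the implementation before looking at `d >= 2`.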

3.2 Equality of two random m-ary search trees 

The m-ary search trees are one of the natural extensions of BSTs to branching factors $m > 2$ beyond binary; see [25] for thorough discussions. Briefly, the first $m - 1$ keys are stored in the root and sorted in increasing order; each of the remaining $n - m + 1$ keys is then directed to one of the $m$ subtrees, corresponding to the $m$ intervals specified by the $m - 1$ sorted keys, and the subtrees are constructed recursively by the same procedure.

In the same vein, the probability $q_n$ that two random m-ary search trees are identical is characterized by the following recurrence ($m \ge 2$)

$$q_n = \binom{n}{m-1}^{-2} \sum_{\substack{j_1+\cdots+j_m = n-m+1\\ j_1,\dots,j_m\ge 0}} q_{j_1}\cdots q_{j_m} \qquad (n \ge m-1),$$

with the initial conditions $q_j = 1$, $0 \le j \le m - 2$. The associated generating function $Q(z)$ then satisfies the following nonlinear DE

$$\bigl(z^{m-1}Q^{(m-1)}(z)\bigr)^{(m-1)} = (m-1)!^2\,Q(z)^m, \qquad (28)$$

with the initial conditions $Q(z) = 1 + z + \cdots + z^{m-2} + q_{m-1}z^{m-1} + \cdots$, where $q_j$, $m - 1 \le j \le 2m - 3$, are determined by the above recurrence.
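The recurrence for $q_n$ is easy to iterate exactly. The sketch below assumes the form $q_n = \binom{n}{m-1}^{-2}[z^{n-m+1}]Q(z)^m$ with $q_0 = \cdots = q_{m-2} = 1$; the function name is illustrative, and the power $Q^m$ is computed by repeated truncated convolution.

```python
from fractions import Fraction
from math import comb

def mary_equality_probs(m, n_max):
    """Exact q_0..q_{n_max} for two random m-ary search trees (assumed recurrence)."""
    q = [Fraction(1)] * min(m - 1, n_max + 1)      # q_0 .. q_{m-2} = 1
    for n in range(m - 1, n_max + 1):
        size = n - m + 1                           # total size of the m subtrees
        # coefficients of Q(z)^m up to degree `size`, built by m convolutions
        power = [Fraction(1)] + [Fraction(0)] * size
        for _ in range(m):
            power = [sum(power[i] * q[k - i] for i in range(k + 1))
                     for k in range(size + 1)]
        q.append(power[size] / Fraction(comb(n, m - 1)) ** 2)
    return q
```

For $m = 2$ this reproduces the BST values $1, 1, 1/2, 2/9, \dots$; for $m = 3$ one gets $q_3 = 1/3$, which matches the direct argument that two ternary trees on three keys agree exactly when the same key is the non-root key.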

(1) Leading order analysis: The simple form $Q(z) \sim c_0(1 - z/\rho)^{-\alpha}$ leads to $\alpha = 2$ and

$$\rho c_0 = \bigl((2m-1)!/(m-1)!^2\bigr)^{1/(m-1)}.$$

(2) Resonance analysis: Again, assuming that $Q(z) \sim c_0(1 - z/\rho)^{-\alpha} + c_r(1 - z/\rho)^{r-\alpha}$, we obtain the following algebraic equation characterizing all possible resonances

$$\prod_{2\le j<2m}(r - j) - m(2m-1)! = (r+1)\bigl(r - (2m+2)\bigr)\phi_m(r) = 0,$$

where $\phi_m(r)$ is a polynomial of degree $2(m-2)$ and admits complex-conjugate zeros only. Thus we need to check if the DE (28) is compatible at the resonance $r = 2m + 2$.

(3) Incompatibility: Similar to the case of $d$ random BSTs, the resonance $r = 2m + 2$ is easily checked to be incompatible for each finite value of $m = 2, 3, \dots$, but it is far from obvious how to prove the incompatibility directly for all $m \ge 2$.

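Let $\lambda_m := \bigl((2m-1)!/(m-1)!^2\bigr)^{1/(m-1)}$. Instead of proving the incompatibility of $r = 2m + 2$ for all $m \ge 2$ and that $\rho$ is not an essential singularity, we prove that the DE (28) has the psi-series solution

$$U(Z) = \sum_{j\ge 0} Z^{j-2} \sum_{0\le \ell\le \lfloor j/(2m+2)\rfloor} c_{j,\ell}\,(\log Z)^{\ell},$$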
m    $q_n \approx$                                                                                                          $\lambda_m$
2    $\lambda_2\rho_2^{-n-1}\bigl(n + \frac{3}{5} + \frac{56}{3125}\,n^{-5}\bigr)$                                           $6$
3    $\lambda_3\rho_3^{-n-1}\bigl(n + \frac{4}{7} + \frac{6927696}{78236585}\,n^{-7}\bigr)$                                  $\sqrt{30}$
4    $\lambda_4\rho_4^{-n-1}\bigl(n + \frac{5}{9} + \frac{10419284224}{15568564095}\,n^{-9}\bigr)$                           $\sqrt[3]{140}$
5    $\lambda_5\rho_5^{-n-1}\bigl(n + \frac{6}{11} + \frac{1526061507281984000}{194179984589469879}\,n^{-11}\bigr)$          $\sqrt[4]{630}$
6    $\lambda_6\rho_6^{-n-1}\bigl(n + \frac{7}{13} + \frac{132275788517112977050000}{942913507718961369877}\,n^{-13}\bigr)$  $\sqrt[5]{2772}$

Table 2: The asymptotic approximation to the probability that two random m-ary search trees are equal for $m = 2, \dots, 6$. All $O$-terms are omitted.



which converges absolutely in some cut-region $\mathcal{R}_\varepsilon$ (defined in (11)); see Appendix A1 for details. Then we connect $Q(z)$ and $U(Z)$ by the same arguments as those used above for two random BSTs. In this way, we obtain

$$\rho Q(z) = \lambda_m\Bigl(Z^{-2} - \frac{m}{2m+1}\,Z^{-1}\Bigr) + \sum_{2\le j\le 2m+2} c_j Z^{j-2} + c_{2m+2,1}\,Z^{2m}\log Z + O\bigl(|Z|^{2m+1}|\log Z|\bigr).$$

From this expansion, we then derive the following approximation to $q_n$.

Theorem 3 The probability $q_n = q_n(m)$ that two random m-ary search trees are equal satisfies the asymptotic approximation

$$q_n = \lambda_m\rho^{-n-1}\Bigl(n + \frac{m+1}{2m+1}\Bigr) + K\rho^{-n}n^{-2m-1} + O\bigl(\rho^{-n}n^{-2m-2}\bigr),$$

where $\rho = \rho_m$ and $K$ both depend on $m$.

As for BSTs, the consideration can be extended to $d \ge 2$ random m-ary search trees, and the resonance equation is given by

$$\prod_{0\le j<d(m-1)}(d - r + j) - \frac{m(dm-1)!}{(d-1)!} = \frac{\Gamma(d - r + d(m-1))}{\Gamma(d-r)} - \frac{m(dm-1)!}{(d-1)!}.$$

We then deduce that this equation has no positive integral resonance when $m$ is even and $d$ is odd, and has the positive resonance $d(m+1)$ in all other cases with $d, m \ge 2$. Our approach can be applied and we obtain an asymptotic approximation to the probability that $d$ random m-ary search trees are equal, the error terms beyond the constant term being either exponentially small when $m$ is even and $d$ is odd or of order $\rho^{-n}n^{-dm-1}$ in all the remaining meaningful cases.
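The parity rule for $d$ random m-ary search trees can be checked mechanically. The sketch below evaluates the resonance polynomial in the product form $\prod_{0\le j<d(m-1)}(d-r+j) - m(dm-1)!/(d-1)!$ (as read from the display above; function names are illustrative) and lists its positive integer zeros in a finite search window, whose default size is an arbitrary choice.

```python
from math import factorial

def resonance_value(d, m, r):
    """prod_{0 <= j < d(m-1)} (d - r + j)  -  m (dm-1)! / (d-1)!  at integer r."""
    prod = 1
    for j in range(d * (m - 1)):
        prod *= d - r + j
    return prod - m * factorial(d * m - 1) // factorial(d - 1)

def positive_integer_resonances(d, m, r_max=None):
    """Positive integer zeros of the resonance polynomial up to r_max."""
    if r_max is None:
        r_max = 2 * d * (m + 1)   # search window, chosen generously
    return [r for r in range(1, r_max + 1) if resonance_value(d, m, r) == 0]
```

For instance, `positive_integer_resonances(2, 3)` returns `[8]`, that is $d(m+1)$, while `positive_integer_resonances(3, 2)` returns `[]`, the case of an odd number of BSTs.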



3.3 Equality of two random fringe-balanced BSTs 

Median-of-(2t + 1) (or fringe-balanced) BSTs represent yet another class of extensions of BSTs. The idea is that, instead of placing the first element of the given sequence at the root, which may result in a less balanced binary tree, we take a small sample of size $2t + 1$ and use the median of this sample as the root element, which then partitions the remaining elements as in the construction of BSTs, where $t \ge 0$. This simple balancing scheme has turned out to be useful for small $t$, notably for the corresponding quicksort algorithm. Note that the original BST corresponds to $t = 0$.

For the probability model, assume, as in random BSTs, that we are given a random permu- 
tation; then we construct the corresponding median-of-(2t + 1) BST, which is called a random 
median-of-{2t + 1) BST. 

Let now $f_n = f_n(t)$ denote the probability that two randomly chosen permutations lead to an identical median-of-(2t + 1) BST. Then $f_n$ satisfies the recurrence

$$f_n = \sum_{t\le j\le n-1-t}\left(\frac{\binom{j}{t}\binom{n-1-j}{t}}{\binom{n}{2t+1}}\right)^{2} f_j f_{n-1-j} \qquad (n \ge 2t+1), \qquad (29)$$

with the initial conditions $f_n = 1$ for $0 \le n \le 2t$.
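A direct implementation of the recurrence (29) is sketched below, assuming the squared splitting probabilities $\binom{j}{t}\binom{n-1-j}{t}/\binom{n}{2t+1}$ as weights and $f_n = 1$ for $0 \le n \le 2t$; the function name is illustrative.

```python
from fractions import Fraction
from math import comb

def fringe_equality_probs(t, n_max):
    """Exact f_0..f_{n_max} for two random median-of-(2t+1) BSTs (assumed form of (29))."""
    f = [Fraction(1)] * min(2 * t + 1, n_max + 1)   # f_0 .. f_{2t} = 1
    for n in range(2 * t + 1, n_max + 1):
        denom = comb(n, 2 * t + 1)
        total = Fraction(0)
        for j in range(t, n - t):                    # t <= j <= n-1-t
            w = Fraction(comb(j, t) * comb(n - 1 - j, t), denom)
            total += w * w * f[j] * f[n - 1 - j]
        f.append(total)
    return f
```

Setting `t = 0` recovers the plain BST sequence, and for `t = 1` the first nontrivial value is $f_4 = 1/2$, since the median of the first three of four keys has rank 2 or 3 with equal probability.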

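Let $F(z) := \sum_{n\ge 0} f_n z^n$ denote the generating function of $f_n$. Then $F(z)$ satisfies the DE

$$\bigl(z^{2t+1}F^{(2t+1)}(z)\bigr)^{(2t+1)} = \frac{(2t+1)!^2}{t!^4}\Bigl(\bigl(z^{t}F^{(t)}(z)\bigr)^{(t)}\Bigr)^{2}, \qquad (30)$$

with the initial conditions $F^{(j)}(0) = j!$, $0 \le j \le 2t$, and $f_j$, $2t + 1 \le j \le 4t + 1$, given by the recurrence (29).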

(1) Leading order analysis: With the simple form $F(z) \sim c_0(1 - z/\rho)^{-\alpha}$, we obtain $\alpha = 2$ and

$$\rho c_0 = \frac{(4t+3)!\,t!^4}{(2t+1)!^4},$$

for each $t \ge 0$.

(2) Resonance analysis: Again, assuming that $F(z) \sim c_0(1 - z/\rho)^{-\alpha} + c_r(1 - z/\rho)^{r-\alpha}$, we obtain the resonance equation

$$\Phi_t(r) = \Bigl(\prod_{2\le j\le 2t+1}(r - j)\Bigr)\Bigl(\prod_{2t+2\le j\le 4t+3}(r - j) - 2\prod_{2t+2\le j\le 4t+3} j\Bigr) = 0,$$

which can be factored into the form

$$(r+1)(r - 6t - 6)\,\phi_t(r)\prod_{2\le j\le 2t+1}(r - j),$$

where $\phi_t(r)$ has only complex-conjugate zeros since the factor

$$(r - 2t - 2)\cdots(r - 4t - 3) - 2(2t+2)\cdots(4t+3) = (r - 2t - 2)\cdots(r - 4t - 3) - (2t+3)\cdots(4t+4)$$

never vanishes for $r \in \mathbb{R}\setminus\{-1, 6t+6\}$. Thus we get yet another new pattern for the least positive integer-valued resonance.
(3) Incompatibility: As $t = 0$ has already been addressed in Section 2, we focus on $t \ge 1$, which has the constant resonance $r = 2$. A direct check of the incompatibility is possible for $r = 2$ and $t \ge 1$; see Appendix A2.

The same psi-series method applies and we obtain for $t \ge 1$

$$\rho F(z) = \frac{(4t+3)!\,t!^4}{(2t+1)!^4}\left(Z^{-2} - \frac{2(t+1)^2}{6t+5}\,Z^{-1} + \frac{(22t^2+35t+14)(t+1)^2\,t}{(7t+6)(6t+5)^2}\,\log Z + O\bigl(|Z||\log Z|\bigr)\right).$$



Theorem 4 The probability $f_n$ that two random median-of-(2t + 1) BSTs are equal satisfies the asymptotic approximation

$$f_n = \frac{(4t+3)!\,t!^4}{(2t+1)!^4}\,\rho^{-n-1}\left(n + \frac{3+2t-2t^2}{6t+5} - \frac{(22t^2+35t+14)(t+1)^2\,t}{(7t+6)(6t+5)^2}\,n^{-1}\right) + O\bigl(\rho^{-n}n^{-2}\bigr),$$

for $t \ge 1$, where $\rho = \rho_t$ is an effectively computable constant.

Note that the expansion also holds when $t = 0$ but the $O$-term becomes $O(\rho^{-n}n^{-5})$; see (10). Also more terms can be computed by the same procedure.



4 Moments of high orders 

In addition to the equality of random trees, another rich source where nonlinear recurrences 
and differential equations of the same type as we analyzed above arise is the asymptotics of 
moments of high orders. 



4.1 Partial match queries in random quadtrees 

We consider first in this section the cost of partial match queries in random two-dimensional quadtrees. The expected cost was first analyzed in [15] (see also [8]) and the limit law derived in [30] under an idealized model where randomness is preserved throughout the tree.

Let $\nu = (\sqrt{17} - 3)/2$. Then the cost of a random partial match query in a random two-dimensional quadtree of $n$ nodes tends (under an idealized model where randomness is preserved for all subtrees), after being normalized by $n^{\nu}$, to a limit law $X$ whose moments satisfy (see [30])

$$\mathbb{E}(X^m) = \frac{a_m}{\Gamma(m\nu+1)},$$

where $a_1 := \Gamma(2\nu+2)/(2\Gamma(\nu+1)^2)$ and

$$a_m = \frac{2}{\nu(m-1)((m+1)\nu+3)}\sum_{1\le j<m}\binom{m}{j}a_j a_{m-j} \qquad (m \ge 2).$$
Then the generating function $A(z) := 1 + \sum_{m\ge 1} a_m z^m/m!$ satisfies the differential equation

$$\nu^2z^2A''(z) + 2zA'(z) + 2A(z) = 2A^2(z), \qquad (32)$$

with the initial conditions $A(0) = 1$ and $A'(0) = a_1$.
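The dominant singularity of $A(z)$ can be estimated numerically from the Taylor coefficients. The sketch below is only a plausibility check under stated assumptions: it takes $b_m = a_m/m!$ to satisfy $b_m = \frac{2}{\nu(m-1)((m+1)\nu+3)}\sum_{1\le j<m} b_j b_{m-j}$ (the coefficient form of (32)), uses $a_1 = \Gamma(2\nu+2)/(2\Gamma(\nu+1)^2)$, and corrects the coefficient ratio with the leading behaviour $b_m \approx \rho^{-m}(3\nu^2 m + 9\nu/5)$; all function names are illustrative.

```python
from math import gamma, sqrt

NU = (sqrt(17) - 3) / 2

def scaled_moments(m_max):
    """b_m = a_m / m! for m = 0..m_max, from the assumed coefficient recurrence of (32)."""
    a1 = gamma(2 * NU + 2) / (2 * gamma(NU + 1) ** 2)
    b = [1.0, a1]
    for m in range(2, m_max + 1):
        conv = sum(b[j] * b[m - j] for j in range(1, m))
        b.append(2.0 * conv / (NU * (m - 1) * ((m + 1) * NU + 3)))
    return b

def rho_estimate(m):
    """Estimate rho from b_m / b_{m+1}, corrected by the two leading asymptotic terms."""
    b = scaled_moments(m + 1)
    lead = lambda k: 3 * NU ** 2 * k + 9 * NU / 5
    return b[m] / b[m + 1] * lead(m + 1) / lead(m)
```

Since the DE is invariant under rescaling of $z$, the value of $\rho$ is inversely proportional to $a_1$, so this estimate is sensitive to the normalization chosen for $a_1$.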

The psi-series method we use above can be readily applied with the resonance r = 6 and 
we obtain 

=S£' (33) 

468(153. + 545) 

109375 ^ VI I I i; ' 

where the $c_j$'s are unimportant constants. By singularity analysis ([11]), we conclude the following asymptotic approximation to $a_m/m!$.

Theorem 5 The m-th moment of $X$ satisfies for large $m$

$$\mathbb{E}(X^m) = \frac{m!\,\rho^{-m}}{\Gamma(m\nu+1)}\left(3\nu^2 m + \frac{9}{5}\,\nu + \frac{1404(39\nu+139)}{21875}\,m^{-5} + \frac{8424(139\nu+495)}{21875}\,m^{-6} + O\bigl(m^{-7}\bigr)\right), \qquad (34)$$

where $\rho \approx 1.37649\,44410\,57156\,25755\ldots$.



We omit all details as they are very similar to the case of the equality of two random BSTs. 

An interesting implication of our psi-series analysis is that we can derive an asymptotic expansion for the moment generating function of $X$

as $|z| \to \infty$ in the sector $|\arg(z)| \le (\nu - \varepsilon)\pi/2$. This is proved by the integral representation

$$\mathbb{E}(e^{Xz}) = \frac{1}{2\pi i}\int e^{s}s^{-1}A(z/s^{\nu})\,ds,$$

for a suitable Hankel-type contour, and standard analysis; see Appendix A3. Such an expansion for the moment generating function is unusual in the probability literature and implies in turn that

$$-\log\mathbb{P}(X > t) \sim (1-\nu)\,\nu^{\nu/(1-\nu)}(\rho t)^{1/(1-\nu)}, \qquad (36)$$

for large $t$, by an application of a Tauberian argument; see Section 4.12 of Bingham et al. [5].
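The exponent and the constant in (36) can be recovered heuristically by a Legendre-transform (Chernoff-type) computation. The sketch below assumes $\log\mathbb{E}(e^{Xz}) \sim (z/\rho)^{1/\nu}$ for large real $z$, which is consistent with the order and type of the moment generating function quoted in Appendix A3; it is a plausibility check only and does not replace the Tauberian argument.

```latex
\begin{align*}
-\log\mathbb{P}(X>t) &\approx \sup_{z>0}\Bigl\{zt - (z/\rho)^{1/\nu}\Bigr\},\\
\text{stationarity:}\quad t &= \frac{1}{\nu\rho}\,(z/\rho)^{1/\nu-1}
  \;\Longrightarrow\; z/\rho = (\nu\rho t)^{\nu/(1-\nu)},\\
\text{value:}\quad \rho t\,(\nu\rho t)^{\nu/(1-\nu)} - (\nu\rho t)^{1/(1-\nu)}
  &= (1-\nu)\,\nu^{\nu/(1-\nu)}(\rho t)^{1/(1-\nu)}.
\end{align*}
```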

Note that the transformations z = and A{z) = 2^Z{^) brings the DE (132]) to the 
standard form of the so-called Emden 's equation 

But it is not exactly solvable; see flQ", § 2.3] or |l22l § 12.4]. 
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4.2 Partial match queries in random relaxed k-d trees 

In a similar setting, the cost of a random partial match query in a random relaxed k-d trees (see 
ifTH ) tends, after proper normalization, to the limit law Y whose moments satisfy (see f2M ) 



r(m/3 + 1) 



where $\beta := (-1 + \sqrt{9 - 8s/k})/2$ ($s$ out of the $k$ coordinates in the query pattern are specified, the other $k - s$ being "don't-cares"), and



i<i< 

with 



2r(2/3 + 2) 



+ 1)2(2/3 + l)r3(/3 + l)- 

It follows that the generating function $B(z) := 1 + \sum_{m\ge 1} b_m z^m/m!$ satisfies the nonlinear differential equation

$$\beta z^2B''(z) + (\beta+1)^2 zB'(z) + (\beta+1)B(z) = (\beta+1)B^2(z) + \beta(\beta+1)zB'(z)B(z), \qquad (37)$$

with the initial conditions $B(0) = 1$ and $B'(0) = b_1$.

The psi-series method applies with a resonance at r = 2 and we obtain the expansion 

^ 2 7-1^/3-1 ^ , 2(/3-l)(/3 + 2) ^, 2 

from which we deduce an asymptotic approximation to higher order moments of Y. 
Theorem 6 The m-th moment of the limit law Y satisfies 

■^<-) ^ w^mrr, - ^^^^ - - « • 

as $m \to \infty$, where $\rho$ depends on $\beta$.

Consequences of this expansion can be derived as those for X. 



4.3 Recursive partition structures. 

In the context of recursive interval splitting, Gnedin and Yakubovich [20] derived the following recurrence relation for the m-th moment $h_m$ of a certain limit law $W$ (satisfying a fixed-point equation with Dirichlet distribution as prefactors)

= n.fltnxtl + ^) ^ + -)r((m - j)A + .)h, (38) 

for $m \ge 2$ with $h_0 = h_1 = 1$, where $\lambda, \omega > 0$ ($\lambda$ is referred to as the Malthusian exponent) and $d = 2, 3, \dots$.
The case when d = 2. Consider first the simplest case when $d = 2$. In this case, the generating function

$$h(z) := \sum_{m\ge 0} h_m\,\frac{\Gamma(m\lambda+\omega)}{\Gamma(\omega)\,m!}\,z^m \qquad (39)$$

satisfies the DE (using the relation $(\lambda+\omega)(\lambda+\omega+1) = 2\omega(\omega+1)$)

$$vz^2h''(z) + zh'(z) + h(z) = h^2(z),$$

which is exactly of the type of problems we have been examining in this paper (cf. (32)), where for simplicity

$$v := \frac{\lambda^2}{\omega(\omega+1)}.$$

For this DE, we can apply the psi-series method and obtain ($Z = 1 - z/\rho$)

$$h(z) = 6vZ^{-2} - \frac{6}{5}(6v-1)Z^{-1} + \sum_{2\le j\le 6} c_j Z^{j-2} + KZ^4\log Z + O\bigl(|Z|^5|\log Z|\bigr),$$

where

$$K := -\frac{(v-1)^2(v-6)(6v-1)(2v+3)(3v+2)}{43750\,v^5}.$$

Consequently, we deduce the asymptotic expansion for the moments of $W$

$$h_m = \frac{\Gamma(\omega)\,m!\,\rho^{-m}}{\Gamma(m\lambda+\omega)}\left(6vm + \frac{6}{5}(1-v) - 24Km^{-5} + O\bigl(m^{-6}\bigr)\right),$$

for large $m$.
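The quadtree equation (32) is the special case $\omega = 1$, $\lambda = \nu$ of the $d = 2$ equation above, so that $v = \nu^2/2$. The following sketch uses this to cross-check two constants numerically: $-24K$ evaluated at $v = \nu^2/2$ against the $m^{-5}$ coefficient $1404(39\nu+139)/21875$ of Theorem 5. Both formulas are taken as read from the text (the garbled source leaves some exponents and signs open), and the function names are illustrative.

```python
from math import sqrt, isclose

def log_term_coefficient(v):
    """K(v): coefficient of Z^4 log Z in the d = 2 expansion (as read from the text)."""
    return -((v - 1) ** 2 * (v - 6) * (6 * v - 1) * (2 * v + 3) * (3 * v + 2)) / (43750 * v ** 5)

def quadtree_m5_coefficient():
    """The m^{-5} coefficient quoted in Theorem 5, with nu = (sqrt(17) - 3) / 2."""
    nu = (sqrt(17) - 3) / 2
    return 1404 * (39 * nu + 139) / 21875

def quadtree_from_general():
    """The same coefficient obtained as -24 K(v) with v = nu^2 / 2."""
    nu = (sqrt(17) - 3) / 2
    return -24 * log_term_coefficient(nu ** 2 / 2)
```

The agreement of the two values follows algebraically from $\nu^2 = 2 - 3\nu$, which gives $\nu^4(39\nu+139) = 16$.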



The case when d > 2. From the recurrence (l38l) . the generating function h(y) (defined as in 
(|39l)) satisfies the DE 

where co'^ = to ■ ■ ■ {u + d — 1) denotes the rising factorial; see [|20l . The DE is however 
less manageable. We rewrite it as follows. Let z = and H{z) = z'^h{z), where k := 
{d + Lu — 1) / X. Note that the Malthusian exponent A satisfies the relation 



(X + cuY 2' 

Then the function H{z) satisfies the DE 

xe{xe -i)---{xe-d + i)h{z) = z-^u/Hizf, (40) 

where the differential operator $\theta$ is defined as $\theta := z(d/dz)$.

The leading order analysis and the resonance analysis give the dominant exponent $-d$, and the resonance equation is exactly the same as (26) for all $d \ge 2$, namely, $(d - r)^{\bar d} - (d+1)^{\bar d}$. It follows that we have the same asymptotic pattern for $H$ as in the case of $d$ random BSTs.
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The case when d is odd. The movable singularity p is a pole of order d and the solution H{z) 
admits the Laurent expansion 

where 

d (4rf-2)a; + (rf-l)(5rf-2) 
'^-~2 2(3rf-l)A ' ^^^^ 

and Si(z) is an analytic function at ;s = p. 

The case when d is even. In this case, since the resonance equation (26) possesses the unique positive integral resonance $3d$, we see that $z = \rho$ is a pseudo-pole and the psi-series solution to (40) has the form

$$H(z) = \sum_{0\le j\le 3d} c_j Z^{j-d} + KZ^{2d}\log Z + O\bigl(|Z|^{2d+1}|\log Z|\bigr),$$

where, in particular, $c_0$ and $c_1$ are given as in (41), and $K$ is a constant dependent on $\lambda$ and $\omega$.

Expansions for h. It is not difficult to verify that $h(z)$ and $H(z)$ have the same dominant singularity $\rho$, dominant exponent $-d$, and the dominant resonance $3d$. Now by the relation between $h(z)$ and $H(z)$: $h(z) = (1 - Z)^{-\kappa}\rho^{-\kappa}H(z)$, we obtain

E CjZ'^"^ + ^2{z), if is odd; 

o<i<d 

E c^.Z^-'^ + ir'Z^'^logZ 
o<i<3d , if is even, 

t +0(|Z|2^+l|l0gZ|) 



, , , (2d)!A'^ . 
hiz) = - — - — = X < 



where Cq = 1, 



and S2 is analytic at z = p. 



, _d fd + 2uj-l 
^^~2 V (3rf- 1)A ~ 



Asymptotics of the moments. From the expansions we derived and a similar analysis as for 
d random BSTs, we can now conclude the following asymptotic approximations to the limit 
law W. 

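Theorem 7 The m-th moment $h_m$ of $W$ satisfies

$$h_m = \frac{(2d)!\,\Gamma(\omega)^2\lambda^d\,m!\,\rho^{-m}}{2\,d!\,(d-1)!\,\Gamma(\omega+d)\,\Gamma(m\lambda+\omega)}\left(\sum_{0\le j<d} c_j m^{d-1-j} + \begin{cases} O\bigl((1-\varepsilon)^m\bigr), & \text{if $d$ is odd;}\\ Cm^{-2d-1} + O\bigl(m^{-2d-2}\bigr), & \text{if $d$ is even,} \end{cases}\right)$$

where $\varepsilon \in (0, 1)$, the $c_j$ are constants with $c_0 = 1$, and $\rho$, $C$ are constants depending on $d$, $\lambda$, $\omega$.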



4.4 An Ansatz solution in Boltzmann equations 

The following sequence $t_n$ arose in the analysis (see [2]) of exact solutions of the Tjon-Wu representation of Boltzmann equations (which represent the major cornerstone of kinetic theory in statistical mechanics). Let $\nu$ be a positive integer. The sequence $t_n$ is defined recursively as
- 1) - (ri + 1) ] tn = - ^^^^-1 ^ 2), (42) 

with $t_0 = t_1 = 1$. This recurrence translates into the following DE for the generating function



i/(z/ + l) 2 



z^T"{z) - zT'{z) - T{z) (1 - T{z)) = 0, (43) 



with the initial conditions T(0) = T'(0) = 1. 

Straightforward computations as above give $-2$ as the dominant exponent for the dominant term of $T(z)$ and $(r+1)(r-6)$ as the resonance polynomial for each $\nu = 1, 2, \dots$. Interestingly, for the resonance $r = 6$, the two special cases $\nu = 1, 2$ do not lead to an incompatible system of equations, in contrast to all higher values of $\nu$. This is very different from the cases we have been dealing with up to now. According to the ARS method, the cases $\nu = 1, 2$ have the Painlevé property [§1.2, Definition 1.1] and have solutions in terms of Laurent expansions with two free parameters; in other words, they are integrable, and we will derive closed-form solutions for them. The remaining cases $\nu \ge 3$ have psi-series solutions.

Exactly solvable (integrable) case : u = 1. We start with the case u = 1. Consider the 
transformations T{z) = 1 — C^(C) ^ = ^C- Note that, by this transform, the coefficients 
[C"]^(C) positive and the transformed DE (after multiplying V'{C)) becomes 

1 o d / Vdl/^ ^ 



or equivalently. 



VC^ = VV{Cr-l, V{0) = 1. (44) 



By the relation between T{z) and V(C), we deduce that \^(0) = 1 and V'{0) = 3. Then (gH) is 
solved as 
Let

$$2\sqrt{\zeta_\infty} = \int_1^\infty\frac{dx}{\sqrt{x^3-1}} \approx 2.42865\,06478\,87581\,61181\ldots,$$

or $\zeta_\infty \approx 1.47458\,59923\,71192\,48035\ldots$. Obviously $V(\zeta) \to \infty$ as $\zeta \to \zeta_\infty$. Let $\Delta := 2(\sqrt{\zeta_\infty} - \sqrt{\zeta})$. Then (45) can be written as



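$$\Delta = \int_{V(\zeta)}^\infty\frac{dx}{\sqrt{x^3-1}}.$$

Since $V(\zeta) \to \infty$ as $\zeta \to \zeta_\infty$, we deduce that

$$\Delta = 2V(\zeta)^{-1/2} + \frac{1}{7}V(\zeta)^{-7/2} + \frac{3}{52}V(\zeta)^{-13/2} + \frac{5}{152}V(\zeta)^{-19/2} + \text{smaller order terms}.$$

Consequently, by inverting the series (justified by analyticity and standard arguments), we obtain

$$V(\zeta) = 4\Delta^{-2} + \frac{\Delta^{4}}{112} + \frac{\Delta^{10}}{652288} + \frac{\Delta^{16}}{5552275456} + \text{smaller order terms}.$$

Finally, let $\rho := -\zeta_\infty$ and we obtain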

= [z^]T{z) = (-l)-i[C"-i]F(C) = / C^"^(C) dC 



ICI=C<Co 



2m 



-2 



= (-iri(4n-2)C^" 
= 2(-ir-i(2n-l)|p|-", 

the errors omitted being exponentially smaller. 

Exactly solvable (integrable) case : u = 2. The case when = 2 is similar. We now adopt 
the transformations T{z) = 1 — C^-^(C) and z = —C^. Then the DE (l42l) becomes 



^^,L(C)-6L(Cr = 0^ -^-^-j -2L(cn=0 
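with the initial values $L(0) = 0$ and $L'(0) = 1$. Thus, the solution is given by

$$\zeta = \int_0^{L(\zeta)}\frac{dx}{\sqrt{1+4x^3}}. \qquad (46)$$

Let $\zeta_\infty$ denote the dominant singularity of $L(\zeta)$. Then

$$\zeta_\infty = \int_0^\infty\frac{dx}{\sqrt{4x^3+1}} = \frac{2^{1/3}}{6}\,\mathrm{Beta}\Bigl(\frac{1}{6}, \frac{1}{3}\Bigr) \approx 1.76663\,87502\,85449\,95731\ldots.$$

Thus the dominant singularity of $T(z)$ when $\nu = 2$ is

$$\rho = -\zeta_\infty^3 = -\frac{1}{108}\,\mathrm{Beta}\Bigl(\frac{1}{6}, \frac{1}{3}\Bigr)^3 \approx -5.51370\,15767\,10567\,75506\ldots.$$

Furthermore, from (46), we have

$$\Delta := \zeta_\infty - \zeta = \int_{L(\zeta)}^\infty\frac{dx}{\sqrt{4x^3+1}},$$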



and, by the same procedure as above. 



A = LiC)-'/' - ^HC)-'/' + -±^L{0-''/' - Y^HC)-''^' + smaller order terms, 
for C ~ Coo- By inverting the expansion 

r..X A -2 A^ 3A22 

L C = A 1 1 smaller order terms. 

28 10192 5422144 9868302080 



Accordingly, 



27rz 

3(-l)' 



27ri 
3(-l)"-i 
27r^ 



kl=c<|p| 

C-3--^Ti-e)dC 



|C|=c'<|p|l/3 

C-^"+iL(C)dC = 3(-i)"-MC'"-']^(C) 

ICI=c'<Coo 



~3(-ir-i[n"] (Coo-cr 
= 3(-ir-i(3n-i)c^^" 

= 3(-ir-i(3n-l)|pr". 

Note that we can use the transforms = C^ and T(z) = 1 — l^(C)C^ to convert the DE for 
z/ = 1 to a DE of same type (differing only by a constant) as the case for u = 2. Also both 
solutions can be expressed in terms of Weierstrass p functions. 
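The numerical constants of the two integrable cases can be reproduced from Gamma-function values. The sketch below uses the Beta-function form of $\zeta_\infty$ for $\nu = 2$ given in the text; for $\nu = 1$ it uses the evaluation $\int_1^\infty dx/\sqrt{x^3-1} = \frac{1}{3}\mathrm{Beta}(\frac{1}{6}, \frac{1}{2})$, which is not stated in the text and is supplied here as an assumption of the sketch (it follows from the substitution $x = u^{-1/3}$). Variable names are illustrative.

```python
from math import gamma

def beta(a, b):
    """Euler Beta function via Gamma functions."""
    return gamma(a) * gamma(b) / gamma(a + b)

# nu = 1: 2*sqrt(zeta_inf) = int_1^oo dx / sqrt(x^3 - 1) = Beta(1/6, 1/2) / 3
two_sqrt_zeta_nu1 = beta(1 / 6, 1 / 2) / 3
zeta_inf_nu1 = (two_sqrt_zeta_nu1 / 2) ** 2

# nu = 2: zeta_inf = int_0^oo dx / sqrt(4 x^3 + 1) = 2^(1/3)/6 * Beta(1/6, 1/3)
zeta_inf_nu2 = 2 ** (1 / 3) / 6 * beta(1 / 6, 1 / 3)
rho_nu2 = -zeta_inf_nu2 ** 3
```

These values agree with the decimal expansions quoted above to the precision of double-precision arithmetic.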

The remaining cases: $\nu \ge 3$. Unlike the preceding two cases, the remaining $\nu$'s no longer lead to DEs that are solvable by quadrature (a DE is said to be solvable by quadrature if its solution can be expressed in terms of one or more integrations). Due to incompatibility, we apply again the psi-series method. Because of the negative sign on the right-hand side of (42), we consider the transform $z = -\zeta$ and $T(z) = 1 - \zeta V(\zeta)$. Then

$$t_n = [z^n]T(z) = (-1)^{n-1}[\zeta^{n-1}]V(\zeta),$$

and (43) is translated into
Let now $Z = 1 - \zeta/\rho$, where $\rho > 0$ is the dominant singularity of $V$ (having all Taylor coefficients positive). Then we deduce the psi-series expansion for $V$



u + 2 5(. + 2) ^ 

+ KZHogZ + {\Z\^\\ogZ\) 



where 

(z/ - l)(z/ - 2)(z/ + 3)(z/ + 4)(2z/ + l)(2z/ + 3)(3z/ + 2)(3z/ + 4) (z/^ + 2i/ + 2) 



43750z/5(z/ + l)5(z/ + 2) 



This, together with the approximations we derived for $t_n$ in the two cases $\nu = 1, 2$, implies the following asymptotics of $t_n$.

Theorem 8 The sequence tn satisfies the asymptotic expansion 

,n-u _ .-nY6z/(z/ + l)^ 6{u' + 2u + 2) , r 0((1 ifiy = 1,2; 



' l ^ + 2 5(z/ + 2) ^ \ 24irn-5 + 0(n-6), i/z/ > 3. 

(47) 

Note that $K = 0$ when $\nu = 1, 2$.



5 Conclusions 

Through the examples we studied in this paper, we see that the psi-series method is a powerful approach to handling nonlinear DEs and yields several surprising results, notably asymptotic expansions with the first few terms missing. While psi-series have long been used in many branches of mathematics and physics, little attention has been paid to the corresponding asymptotics of the coefficients. Also the procedure we adapted and improved from Hille for proving the absolute convergence of psi-series is of certain generality and can be applied to other problems of similar nature.

Another feature of the recurrences we studied in this paper is that they are very sensitive to small variations, the example of $d$ random BSTs being typical. Note first that the recurrence (24) with $d = 0$ yields the well-known Catalan numbers and the case $d = 1$ gives rise to the trivial sequence $p_n = 1$. The case $d = 1$ in a more general form was studied by Wright [33]; see also Cooper [10] for a study of $p_n$ for real $k > 0$.

We now compare the recurrence (24) with the following one by defining $p_1 = 1$ and

$$p_n = n^{-d}\sum_{1\le j\le n-1} p_j p_{n-j} \qquad (n \ge 2).$$

While the case $d = 0$ still yields the Catalan numbers with their generating function satisfying

$$P(z) - z = P^2(z),$$

the case $d = 1$ becomes a nonlinear differential equation of Riccati type

$$zP'(z) - z = P^2(z), \qquad P(0) = 0,$$

which can still be explicitly solved: $P(z) = z^{1/2}J_1(2z^{1/2})/J_0(2z^{1/2})$, where the $J_\nu$'s are Bessel functions (see [23]). The case $d = 2$ is again of Emden-Fowler type and can be solved asymptotically by the psi-series method, as can the remaining cases $d \ge 3$.
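The sensitivity to the parameter $d$ is easy to observe on the first few terms. The sketch below iterates the shifted recurrence $p_1 = 1$, $p_n = n^{-d}\sum_{1\le j\le n-1} p_j p_{n-j}$ exactly; the function name is illustrative.

```python
from fractions import Fraction

def faltung_sequence(d, n_max):
    """Exact p_1..p_{n_max} for p_1 = 1, p_n = n^(-d) * sum_{1<=j<=n-1} p_j p_{n-j}."""
    p = [None, Fraction(1)]                 # index 0 is unused
    for n in range(2, n_max + 1):
        conv = sum(p[j] * p[n - j] for j in range(1, n))
        p.append(conv / Fraction(n) ** d)
    return p[1:]
```

For `d = 0` the output starts with the Catalan numbers 1, 1, 2, 5, 14, and for `d = 1` it starts with 1, 1/2, 1/3, 11/48, which are the first Taylor coefficients of $z^{1/2}J_1(2z^{1/2})/J_0(2z^{1/2})$.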

See [10, 16, 18, 24, 32, 33] and the references therein for some quadratic recurrences of the above "Faltung" type. More examples can be found in the recent papers [3, 4].
Appendix



A1. Proof of the absolute convergence of psi-series

In this Appendix, we group the details of the proof of the absolute convergence of the psi-series 
arising in the three cases: d random BSTs, two random m-ary search trees, and two random 
median-of-(2t + 1) BSTs. We first describe briefly the general pattern of the proof and then 
provide more details for each case. 

Our proof begins with rewriting the original DE in $z$ into a system of linear DEs in $Z = 1 - z/\rho$ of the form



dZ 



U{Z) = X{Z,V), V{Z) 




(A.1) 



where s e {d,2{m - l),At + 2}. Here Uj{Z) = Efc>o ^^^(^)^""^''"^^^ where a is the 
leading order, r = log Z and X : C*"*"^ C^. Then we derive the infinite system of linear DEs 
satisfied by the u^^'s 



( \ 



where A^ 
1 

the form 



A'T.s 



M and M G 



are s X s matrices. 



(A.2) 



In terms of such an infinite system, an upper bound for all m^' (in particular, for ij}^) is of 



u 



for T e ^ 

^ -.^{i^iQ-.i^ (-00, -e\ and \d\ < n} , (A.3) 

with 1 1 — t| > 1 + e, where X is a constant and ijj{k) , c(s) depend on the problem in question. 
Then the absolute convergence can be justified. 

An additional common and interesting feature this approach brings is that the resonance polynomial will be seen to be equal to $\det(rI_{s\times s} - M)$. We will explain this in more detail.

The following relations are useful in converting our DEs in $z$ into those in $Z$ ($\mathbb{D} = d/dz$).



z^p{l-Z), zB^-il-Z)' 



dZ' 



d' 

dZ^' 
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Equality of d random BSTs. The corresponding system (lA.ll) for (|25l) is 



1 < J < c?, 



1-Z ' 

t^^(^) = (-l)'^pt/i(^)^. 
The associated coefficient matrices and in (IA.2I) . A; > 3(i + 1, are given by 



Ak = kl 



dxd 



M, M 



d 




1 

d+1 1 



\ 





2rf - 2 1 

2d - I J 



and 



gfc 



0<^<fc 



0<i<k 



1<£<A: 



P y G C such that 



Due to the existence of complex-conjugate roots, we can find a d x d matrix P with entries 

... \ 

k-r^ 



f k + 1 
k-3d 



PA.P 1 



k — 



\ 



■•• 

rd / 



for G N. By the same norm and same arguments used for two random BSTs, we derive the 
inequality (Crf := ||P|| ||p-i||) 



max 

l<j<d 



< \\4>k 



0<l<k 



\ 



i<e<k 



o<e<k 



(A.4) 



dx. 



Again, by same the arguments used to prove (flTl) . we have. 



<K{l + k)-^/^\l-T\ 



k-3d 



'l<J<d,k>0), 



for r G e^. 
The resonance polynomial equals $\det(rI_{d\times d} - M)$. Direct calculations give the determinant

$$\det(rI_{d\times d} - M) = \frac{(2d-1-r)!}{(d-1-r)!} - \frac{(2d)!}{d!},$$

which is nothing but the resonance polynomial (26).

The reason that the two polynomials are equal is as follows. The distinction between the Laurent expansion and the psi-series expansion depends crucially either on the existence of a positive integer resonance or on whether a relation such as (12) holds for all $k$. This is equivalent to asking whether the linear system $A_k\phi_k = g_k$ is solvable or not for all $k$. If the system (A.2) $A_k\phi_k = g_k$ is solvable under the condition $\det A_k \ne 0$ for all $k$, then by the uniqueness of the solution of (A.2), the solution vectors $\phi_k$'s are constant vectors (independent of $\tau$) and in turn, the series solution $U_1(Z) = Z^{-\alpha}\sum_{k\ge 0} u_k^{[1]}Z^{k}$ is eventually a Laurent series. On the other hand, if $\det A_k \ne 0$ fails to hold for some $A_{k_0}$, then we have the following two cases.



(a) The linear system $A_{k_0}\phi = g_{k_0}$ has a solution depending on $d - \operatorname{rank}(A_{k_0})$ free parameters, and all the remaining constant coefficient vectors $\phi_k$ depend on at least these parameters.

(b) The linear system is inconsistent. Hence it can no longer provide a solution to (A.2). The real solution should be solved from (A.2) instead and then all the vector functions $\phi_k(\tau)$, $k \ge k_0$, depend on $\tau$. Moreover, the resulting solution $U_1(Z) = Z^{-\alpha}\sum_{k\ge 0} u_k^{[1]}(\tau)Z^{k}$ is indeed a psi-series.

In particular, we see that the characteristic polynomial $\det(rI_{d\times d} - M)$ is the same as the polynomial (26) that determines all the possible resonances.


Equality of two random m-ary search trees. The transformed first-order differential system 
in terms of Z for (1281) now has the form 



U[iZ) = U2iZ) 

u:^_,{z) = {i-z)-('-^-'^uuz), 
u^^_,{z) = {m-iy'p"^-'u,izr. 



l<j<m-2, 
l<J<m-2, 



So that the corresponding infinite system (IA.2I) has the coefficient matrix A^ = A;l2(m-i)x2(m-i) ■ 
M, where 



M 



2 




1 ••• 

3 1 
4 1 





\m{2m-l)\ 



m 1 

m + 1 1 



\ 





2m - 2 1 

2m - 1 / 
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and the vector- valued function 



E 

0<j<fe 



m — 





2 + ^ ~ J \ \m\ 







p™-i(m- 1)!' 







[1] [1] [1] 

'■1 '■2 



V 



n+«2H \-im=k 

0<ii<k 



J 



Then similar arguments as those used for (flTl) leads to the upper bound 



(1 < J < 2(m - 1), A; > 0), 



for T E 3^, where the constant K is easily tuned according to the initial conditions. 



Equality of two random median-of-(2t + 1) BSTs. The linear differential system of 4t + 2 
equations of (l30l) is 

U'AZ) = Uj+,{Z), l<j<2t, 



U;{Z) = U,+,{Z), 2t + 2 < J < 4t + 1, 



U2t+l-ii{Z)U2t+l-i2{Z) , 



0<ii,i2<t 



where 



Zl!z2!(t-2l)!2(t-Z2) 

Let Uj{Z) = '£k>o u^k{r)Z''-^-^ for 1 < j < 4t + 2, where 



12' 



(l<J<2t + l). 



Then coefficient matrix = /i;l4t+2x4t+2 — M in (IA.2I) . /c > 4t + 2, is given by 



/ 2 1 
3 1 



M 



\ 




2t + l 








1 

2t + 2 




2(4t+3)! 
(2t+l)! 
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1 

2t + 3 





4t + 2 1 

4t + 3 



and the vector- valued function gk by 

/ 



gfc 



0<j<k ^ J / 





(2t + l)!- 



pti u) 



where 



n u 



0<j<k 
0<i<t 



2t 



\ 



0<j<k 
0<i<t J ) 



o<e<k-j ^-^^ 

l<j<min{fc,2t} 



k—s—j—£ 



l<s<min{fc,2t} 
0< j<min{s,t} 



0<£<k-j 
l<j<m\n{k—s,2t—s} 



The the same method of proof used for (flTl) yields the upper bound (ro = 2 or 6t + 6) 



(t) <C(l + A;)-i/2|i-r|'=-"«, (1 < J < 4t + 2,A; > 0), 



uniformly for r G where the constants C and K are easily tuned according to the initial 
conditions. 



A2. Proof of the incompatibility of the resonance r = 2 for random median-of-(2t + 1) BSTs

Since the resonance $r = 2$ does not depend on $t$, the incompatibility of the resonance $r = 2$ can be directly checked, which we now do. Let $U(Z) := F(z)$, where $F$ satisfies the DE (30) and $Z = 1 - z/\rho$. Then the DE (30) can be rewritten as

$$\Bigl((1-Z)^{2t+1}U^{(2t+1)}(Z)\Bigr)^{(2t+1)} = C_{t,\rho}\Bigl(\bigl((1-Z)^{t}U^{(t)}(Z)\bigr)^{(t)}\Bigr)^{2}, \qquad (A.5)$$

where all derivatives are with respect to $Z$ and $C_{t,\rho} := \rho\,(2t+1)!^2/t!^4$.

Consider the formal Laurent expansion f{Z) = X]fc>o^fc^'' °- Then for any s G N, we 
have 

((1 - Zyf^\Z)) = J2{k-a- s)^Z'~'^-" (-1)' (') - " - ^■)^'^-^' 

fc>0 0<j<s ^-^^ 
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where Uj := 0, j < 0. Substituting this into (IA.5I) . we have 

fc>0 0<i<2t+l \ J / 

k>0 o<e<k 

where 



Xfc = (fc - a - 1)^ 5^ ("^"j (fc - « - J 



o<j<t 

Equating the dominant term (with A; = 0) leads to the obvious solution a = 2. Consider now 
the relation 

o<j<2t+i V J / o<e<k 

For $k = 0$, we get $\rho u_0 = (4t+3)!\,t!^4/(2t+1)!^4$, and for $k = 1$, we get $u_1 = -2(t+1)^2u_0/(6t+5)$. Now for $k = 2$, we have

|2 / / /OJ- I T\U I 1\/ M I j-2/. I i\2\ „,2 , „,2 



= Ct,pi2ty/ \^\^{2t+l){t + l)\^^j +t\t + iyj 4 + ui - t{At + 3)uoUi 
+ (4t + l)!(2t + 1)2^1 - (4t + l)!(2t + l)(2t + 2) (^^^ ^)«o 

(4t + 2)!(t+l) ,4 n . , , 

= -^-^TT^r^^uo 216t' + 522t=^ + 437t' + 141t + 12) ^ 0, 
4(6i(: + 5)"' ^ ^ 

since $t \ge 1$. This proves the incompatibility of the resonance $r = 2$ for all $t \ge 1$.

A3. Asymptotics of the moment generating function 

We prove (35), starting from Hankel's integral representation of the Gamma function

$$\frac{1}{\Gamma(w)} = \frac{1}{2\pi i}\int_{\mathcal{H}_0} e^{s}s^{-w}\,ds,$$

where $\mathcal{H}_0$ starts at $-\infty$, encircles the origin once counter-clockwise and returns to its starting point. For definiteness, we may take

$$\mathcal{H}_0 = \{s = xe^{\pm i\pi} : R_0 \le x < \infty\}\cup\{s = R_0e^{i\theta} : -\pi \le \theta \le \pi\} \qquad (R_0 > 0).$$

This gives

$$M(z) = \frac{1}{2\pi i}\int_{\mathcal{H}_0} e^{s}s^{-1}A(z/s^{\nu})\,ds,$$

where $A(z)$ satisfies the DE (32). Note that $M$ is an entire function of order $1/\nu > 1$ and of type $\rho^{-1/\nu}$.
Let $z = |z|e^{i\varphi}$, $|z| > 0$ and $|\varphi| < \nu\pi/2$, where $\nu = (\sqrt{17} - 3)/2$. The condition on $\arg z$ implies that the dominant singularity $s = (z/\rho)^{1/\nu}$ of the integrand lies in the half-plane $\Re(s) > 0$ (in which $M(z) \to \infty$ with $z$). On the other hand, if $|\arg(-z)| < \pi - \nu\pi/2$, then one expects that $M(z) \to 0$ with $z$, but the exact determination of the rate is more delicate. The situation here is similar to the Mittag-Leffler function $\sum_{j\ge 0} z^{j}/\Gamma(\alpha j + 1)$; see [13, Ch. 18.1].

The change of variables z/s'" s gives 



H 



2t[w 

where "Hi is the cut circle described by 

ni = {s = xe^'^^™" : < X < U {s = i?ie^'^+™^ : -vr < ^ < vr}. 

Here < i?i < p. We then approach in a way similar to the singularity analysis (see IfTTll ) by 
deforming the contour Hi into I-L2, where 7^2 is of the same shape as Hi but with larger radius 
for the circular part \s\ = R2 = p + e and avoiding the cut from s = p to 00 (in the style of 
ifTTl). Symbolically, 

U2 = {s = xe'^^^"" ■ < X < i?2} 

U {s = i?2e^'^+™^ : -TT < < vr and 1^ - Lplv\ > e,} 

u r„ 

where €z = and Tp is any contour joining the two points i?2e~*^^ and -R2e*^'= and lying 

inside the cut region described by other parts of ^2. 

The remaining analysis is then easy because the main contribution to M(z) comes from Tp 
on which we can apply the local expansion (1331) of A{z), the other parts being negligible 



2'Kiv V 



By making first the change of variables p(l — s) 1— )• s, using the expansion (l33l) . and then 
another change of variables [z/ pY^^s/v ^ s, we deduce that 




where Tq denotes the transformed contour of Tp and the cys are polynomials of s whose exact 
values matter less. Extending the contour to infinity and then evaluating the individual terms 
by Hankel's integral representation of the Gamma function, we obtain 

where we also used the formula 



/ 6^s log s ds = 

27ri dxr(x) 



-24. 



This completes the proof of (1351) . 
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